ACES PRESIDENT'S STATEMENT


It's nice to be here in California in March, especially since we weren't entirely certain that there was going
to be a California in March. This state seems to attract more than its fair share of calamities, ranging from
floods, fires, and mud slides to OJ-things.

In  any  case,  we're  back  at  ACES'95,  the  Eleventh  Annual  Review  of  Progress  in  Applied  Computational 
Electromagnetics,  which  reverts  to  the  Naval  Postgraduate  School,  our  long-time  home. 

Ray Luebbers and his committee have put together an excellent program, one which embraces everything from
'low frequency' CEM to 'high frequency' CEM. The subject matter to be discussed ranges from the traditional
scattering and radiation problems, so typical of ACES past, to optimization techniques in applied
electromagnetics. This area has long been of interest to designers of electrical machines, but we see that it is
now also being applied to hyperthermia and superconducting accelerator magnets.

I am pleased to see ACES embrace all aspects of computational electromagnetics, for there is no single
'natural' milieu of CEM for ACES. The problems of CEM are many and varied. If one has an interest in solving
Maxwell's equations, he should be at home at any frequency, in any environment.

You  will  note  that  Ken  Siarkiewicz,  a  long-time  dedicated  supporter  of  ACES,  has  put  together  an  interesting 
session  entitled  'Research  and  Engineering  Framework  for  CEM'.  Ken  then  did  himself  proud  by  releasing 
$10,000  of  MMACE  (millimeter-wave,  microwave  advanced  computational  environment)  funding  so  that 
ACES  could  properly  support  this  session.  This  tells  me  two  things:  first,  that  Ken  is  a  pretty  good  man  to  have 
on  your  side,  and  second,  that  ACES  is  increasingly  being  viewed  as  the  preeminent  vehicle  to  expose 
issues  in  computational  electromagnetics. 

ACES is reaching this stature because of the dedication of its members, especially those who volunteer to
do its work. When you publish next, think about the ACES Journal or Newsletter as the publication of choice.
When  you  wish  to  give  your  profession  extracurricular  support,  consider  working  for  ACES.  A  lot  of  people 
have,  and  a  lot  more  will. 

Enjoy  the  Eleventh  Annual  Review. 

Harold  A.  Sabbagh 
Sabbagh  Associates,  Inc. 

4635  Morningside  Drive 
Bloomington,  IN  47408 
(812)  339-8273 
(812)339-8292  FAX 
email: has@sabbagh.com
ACES 1995 SHORT COURSES

MONDAY  MARCH  20  FULL-DAY  COURSE 

0830-1630 "Finite Elements for Electromagnetics"

by  John  Brauer,  MacNeal  Schwendler  Corporation 

0830-1630  "GEMACS  From  A-Z"  by  Buddy  Coffey,  Advanced  EM. 

0830-1630  "Physical  Wavelets"  by  Gerald  Kaiser,  University  of  Massachusetts  at  Lowell. 

HALF-DAY  COURSE 

1300-1630  "The  Multiple  Multipole  Program  (MMP):  Theory,  Practical  Use  and  Latest  Features" 
by Pascal Leuchtmann, Swiss Federal Institute of Technology

1300-1630 "Verification and Validation of Computational Software" by E.K. Miller, Ohio University.

SATURDAY  MARCH  25  FULL-DAY  COURSE 

0830-1630 "Using Mathematical Software for Computational Electromagnetics"
by  Jovan  Lebaric,  Naval  Postgraduate  School. 

0830-1630 "Wire Antenna Modeling Using NEC" by Dick Adler, Naval Postgraduate School, Jim
Breakall,  Penn  State  University,  and  Gerry  Burke,  Lawrence  Livermore  National  Lab. 

0830-1630  "FDTD,  Generalized  FDTD  and  FVTD  Techniques  in  Solving  Maxwell's  Equations" 
by  Kane  Yee,  Lockheed. 
FINAL AGENDA


The  Eleventh  Annual  Review  of  Progress  in  Applied  Computational  Electromagnetics 

NAVAL  POSTGRADUATE  SCHOOL 
20  -  25  MARCH,  1995 

Raymond  Luebbers,  Technical  Program  Chairman 
Richard  Gordon,  Proceedings  Editor 
Robert  Lee,  Short  Course  Chairman 
Paul  Goggans,  Publicity  Chairman 
Richard  Adler,  Conference  Facilitator 


MONDAY  20  MARCH 

0830-1630  SHORT  COURSE  (FULL-DAY) 

"Finite  Elements  for  Electromagnetics” 

0830-1630  SHORT  COURSE  (FULL-DAY) 

"GEMACS  from  A-Z" 

0830-1630  SHORT  COURSE  (FULL-DAY) 

"Physical  Wavelets" 

1300-1630  SHORT  COURSE  (HALF-DAY) 

"The  Multiple  Multipole  Program  (MMP):  Theory,  Practical  Use 
and  Latest  Features" 


1300-1630  SHORT  COURSE  (HALF-DAY) 

"Verification  and  Validation  of  Computational  Software" 

0800-2030  CONFERENCE  REGISTRATION 


TUESDAY 21 MARCH


0700  CONFERENCE  REGISTRATION 

0700-0800 CONTINENTAL BREAKFAST

0800  WELCOME  Raymond  Luebbers 

SESSION 1: SCATTERING (parallel with Sessions 2 and 3)

CHAIRS: V. CABLE, E. MILLER


0840 "A CGFFT Method Applied to the Scattering from Finite Size
Microstrip  Antenna" 

0900  "Analysis  of  Scattering  by  Cluster  of  Nonspherical 
Particles  Based  on  Complete  Mathematic  Models” 


0920  "Analytic  Solution  for  Calculating  the  Radar  Cross-Section  and  Related 
Parameters  of  a  Conducting  Right  Circular  Cylinder  Surrounded  by 
Multiple  Layers  of  Lossy  Dielectrics" 

0940  "RCS  of  High  Permittivity  Cubes  Computed  with  the  TLM 
Method" 


1000 BREAK

1020  "Scattering  Analysis  of  Antenna  Installations/Panels  on  a  Curved 
Surface  Using  Uniform  Field  Integration  Method" 


1040 "Code Validation of Aircraft Scattering Parameters using IR
Thermograms"
102 Glasgow

John  Brauer,  MacNeal-Schwendler  Corporation 
122  Ingersoll 

Buddy  Coffey,  Advanced  EM 
101  A  Spanagel 

Gerald  Kaiser,  Univ.  of  Massachusetts  at  Lowell 
325  Ingersoll 

Pascal  Leuchtmann,  Swiss  Federal  Institute 
of  Technology 

323  Ingersoll 

E.  K.  Miller,  Ohio  University 

103  Glasgow  Hall 


103  Glasgow  Hall 


102  Glasgow  Hall 
122  Ingersoll  Hall 


A.  McCowen 


Y.A.  Eremin,  N.W.  Orlov  and 
V.I. Rozenberg

G.W.  Jarriel,  Jr.,  M.  E.  Baginski, 
and  Lloyd  Riggs 


C.  Eswarappa  and  W.J.R.  Hoefer 


J.J.  Kim  and  O.B.  Kesler 


J.  Norgard,  R.  Sega,  M.  Seifert, 
T.  Blocher  and  A.  Pesta 



SESSION 1: SCATTERING (parallel with Sessions 2 and 3) (CONT)


122  Ingersoll  Hall 


1100 

"A  New  Method  for  Solving  Scattering  Problems  with  Conducting  Media  in 
the  Time  Domain" 

M. Schinke and K. Reiß

1120 

"Experience  and  Experiments  at  Cray  Research  with  JUNCTION-2" 

J.A.  Crow  and  Q.M.  Sheikh 

1140 

"Quantitative  Methods  for  Measuring  and  Improving  the 

Performance  of  Electromagnetic  Scattering  Codes" 

J.P.  Meyers,  A.  J.  Terzuoli,  Jr., 
and  G.C.  Gerace 

LUNCH 

SESSION  2:  LOW  FREQUENCY  (parallel  with  Sessions  1  and  3) 

CHAIRS:  K.  KUNZ,  H.  SABBAGH 

117  Spanagel  Hall 

0840 

"Numerical  Modelling  of  EMC  in  Underground  Power  Cable 

Systems  with  the  Hybrid  FE-BE  Method" 

J.  Shen  and  A.  Kost 

0900 

"New  Contribution  to  the  Study  of  Fault  Currents  Distribution  in 
the  Ground  Systems" 

H.O.  Brodskyn,  M.H.  Giarolla,  J.R. 
Cardoso,  N.M.  Abe  and  A.  Passaro 

0920 

"On  the  Oscillatory  Phenomena  of  Eddy  Currents  Along 
the A, V-'E Interface"

Z.  Cheng,  Q.  Hu,  S.  Gao,  Z.  Liu, 

M.  Wu  and  C.  Ye 

0940 

"A  New  MMP-Code  for  Static  Field  Computation" 

M.  Gnos  and  P.  Leuchtmann 

1000 

BREAK 

1020 

"Molten  Aluminum  Flow  Induced  by  High  Magnetic  Fields" 

W.P.  Wheless,  Jr.  and  C.S.  Wheless 

1040 

"The  Electrostatic  Characterization  of  a  N-Element  Planar 

Array  Using  the  Singularity  Expansion  Method" 

J.E.  Mooney  and  L.  Riggs 

1100 

"A  Volume-Integral  Code  for  Electromagnetic  Nondestructive 

Evaluation" 

R.K.  Murphy,  H.A.  Sabbagh, 

J.C.  Treece  and  L.  W.  Woo 

LUNCH 

SESSION  3:  RESEARCH  AND  ENGINEERING  FRAMEWORK  FOR  CEM  (parallel  with  Sessions  1  and  2)  102  Glasgow  Hall 
ORGANIZER:  K.  SIARKIEWICZ 

0840 

"Research  and  Engineering  Framework  (REF)  for  Computational 
Electromagnetics"  (Invited) 

B.  Hantman,  K.  Siarkiewicz, 

J.  Labelle  and  R.  Jackson 

0900 

"Research  &  Engineering  Framework  (REF)  Data  Dictionary 

Specification  for  Computational  Electromagnetics"  (Invited) 

J.A.  Evans 

0920 

"DT  NURBS  -  A  Geometry  Engine  for  Integration  of  the 

MMACE  Data”  (Invited) 

B. Ames and C. Whitcomb

0940 

"Standardized  Grid  Generation  for  the  Research  and 

Engineering  Framework"  (Invited) 

L.W.  Woo,  H.  A.  Sabbagh, 

J.  LaBelle  and  B.  Hantman 

1000 

BREAK 

1020 

"Visualization  and  Standards"  (Invited) 

J.  Cugini 

1040 

"A  Visualization  Toolkit  for  Computational  Electromagnetics"  (Invited) 

B.  Joseph 

1100 

"MMACE  -  Lessons  for  the  Development  of  a  CEM  Computational 
Environment"  (Invited) 

R.G.  Hicks  and  K.R.  Siarkiewicz 

LUNCH 

1200 

BOARD  OF  DIRECTORS  MEETING 

Terrace  Room,  Herrmann  Hall 
1330-1730 VENDOR BOOTHS AND WINE AND CHEESE BUFFET
1800 HAPPY HOUR (NO HOST)

1900 AWARDS BANQUET

1330-1530 SHORT COURSE (PARTIAL DAY, NO FEE) Bob Bevensee
"Time Series Analyses of Equity Stock Prices
and a Profitable Investment Strategy"

1330-1530 SESSION 4: INTERACTIVE TECHNICAL SESSION

SESSION  4A:  EM  THEORY  I 

"Pulse  Basis  Function  Implementation  of  the  Radiation 
Condition  Integral  Equations" 

"Finite  Difference  Solutions  of  Geometrical  Optics  and 
Some  Related  Nonlinear  PDEs  Approximating  High 
Frequency  Helmholtz  Equation" 

"Conversion  of  Mechanical  Energy  to  Electromagnetic  Energy" 
"Block-Toeplitz-Structure-Based Solution Strategies for CEM Problems"


Barbara  McNitt  Ballroom,  Herrmann  Hall 
Barbara  McNitt  Ballroom,  Herrmann  Hall 
Barbara  McNitt  Ballroom,  Herrmann  Hall 
102  Glasgow  Hall 

Barbara  McNitt  Ballroom,  Herrmann  Hall 
Barbara  McNitt  Ballroom,  Herrmann  Hall 
P.C.  Colby 

E. Fatemi, B. Engquist and S. Osher

R.M.  Bevensee 

V.I. Ivakhnenko and E.E. Tyrtyshnikov


"The  Two-Dimensional  Finite  Integral  Technique  Combined 
with  the  Measured  Equation  of  Invariance  Applied  to 
Transverse  Electric  Open  Region  Scattering  Problems" 

"Artificial  Transparent  Boundaries  in  Computational  Quasioptics" 

"A  Statistical  Electromagnetics  (STEM)  Research  Initiation  Report" 

"Optimization  of  Aperiodic  Conducting  Grids” 

"Accurate  MOM  Scattering  Calculations  Using  Massively  Parallel 
Computation" 

"A  New  Angle  on  a  Low  Cost  Ground  Screen  for  Model  Testing  in  the 
Undergraduate  Antennas  Laboratory  (Looking  at  Near  Vertical 
Incidence Skywaves (NVIS) for a Coast Guard Patrol Boat)"

"Efficient  Extraction  of  the  Near-Field  from  CGFFT  Methods  Applied  to 
Scatterers  in  the  Resonance  Region" 

SESSION  4B:  VISUALIZATION  &  INTERFACES 


G.K.  Gothard  and  S.M.  Rao 

A.V.  Popov 

W.P.  Wheless,  Jr.,  C.B.  Wallace  and  W.D.  Prather 
R.L.  Haupt 

L. D.  Vann  and  J.S.  Bagby 

M. E.  McKaughan,  W.M. 

Randall  and  B.  Nutter 

A.  McCowen 

Barbara  McNitt  Ballroom,  Herrmann  Hall 


"Computer  Code  for  Field  Calculation  and  Visualization  in  Quasioptics" 
"Dosimetry  in  a  Voxel  Model  of  the  Head" 

"A  Graphical  User  Interface  for  the  NEC-BSC" 

"MF  Communication  and  Broadcast  Prediction  System” 


Y.V.  Kopylov 
P.J. Dimbylow

L. W.  Henderson  and  R.J.  Marhefka 

M. J.  Packer  and  A.P.  Tsitsopoulos 


"A  Finite  Difference  Time  Domain  Visualization  Tool 
for Microsoft Windows™"


A.  Z.  Elsherbeni, 

C.D.  Taylor,  Jr.  and  C.  E.  Smith 


SESSION  4C:  VALIDATION 


Barbara  McNitt  Ballroom,  Herrmann  Hall 


"Transformable  Scale  Aircraft-Like  Model  for  the  Validation 
of  Computational  Electromagnetic  Models  and  Algorithms;  Initial 
Configuration  and  Results" 

"Measurement  Study  for  Validation  of  Electromagnetic  Scattering  Codes 
on a Complex 3D Target"

"Validation  Using  a  Moment  Method  Approach  with  Exact  Object 
Representation” 


D.R. Pflug and D. Warren


T.  Kienberger  and  D.  Jurgens 


J.A.  Larsson,  S.  Ljung  and 
B. Wahlgren


"IR Measurements for Validating EM Analysis Tools"


M.  Seifert,  T.  Blocher  and  A.  Pesta 


SESSION 4D: EMI/EMC/EMP


Barbara McNitt Ballroom, Herrmann Hall


"Analysis of Electromagnetic Interference at an Ocean Observation Post"  L. Bai and J.F. Dai


"Enforcing  Correlation  on  Statistically  Generated  EM  Cable  Drivers" 


R.  Holland  and  R.  St.  John 


"Analysis  of  Different  Contributions  to  the  Coupling  Between  Reflector  C.  Park  and  P.  Ramanujam 
Antennas  on  a  Satellite" 

"Simple  Radiation  Models  in  Lieu  of  EMC  Radiated  Emissions  Testing"  R.  Perez 

WEDNESDAY  MORNING  22  MARCH 
0730  CONTINENTAL  BREAKFAST 


0800  ACES  BUSINESS  MEETING 


President  Hal  Sabbagh  102  Glasgow  Hall 


SESSION  5:  OPTIMIZATION  TECHNIQUES  IN  APPLIED  ELECTROMAGNETICS  (parallel  with  Sessions  6  and  7)  361  Ingersoll  Hall 
ORGANIZER: O.A. MOHAMMED

0840  "An  Optimization  Approach  to  Reduce  the  Discretization  Error  in  Finite  M.  Feliziani,  E.  Latini,  F.  Maradei 

Element Explicit Solution Scheme" (Invited)


0900  "Analysis  and  Design  of  a  Reentrant  Resonant  Cavity  Applicator  for 
Radio  Frequency  Hyperthermia  System"  (Invited) 


Y.  Kanai,  T.  Tsukamoto,  K.  Toyama, 
T.  Kashiwa,  Y.  Saitoh  and 
M.  Miyakawa 


0920  "Analysis  of  Loaded  Cavities  Using  the  Constitutive  Error  Approach" 
(Invited) 


R.  Albanese,  R.  Fresa,  R.  Martone 
and  G.  Rubinacci 


0940  "The  Design  of  Electromagnetic  Devices  using  Knowledge  Based  Systems  D.A.  Lowther,  D.  N.  Dyck  and 
and  Sensitivity  Information"  (Invited)  R.  Rong 

1000  BREAK 

1020  "A  Computer  Program  for  the  Design  of  Superconducting  Accelerator  S.  Russenschuck 

Magnets"  (Invited) 

1040  "Application  of  Optimization  to  the  Design  of  Electromechanical  Devices"  J.K.  Sykulski  and  Y.B.  Cheng 
(Invited) 

1100 "Genetic Algorithms for the Optimal Design of Electromagnetic Devices"  O.A. Mohammed, G.F. Ofer
(Invited) 

1140 "Linear Constraints - Gradient Technique for the Inverse Problem of  A.A. Arkadan and

Design  Optimization"  (Invited)  S.  Subramaniam-Sivanesan 


SESSION  6:  COMPUTATIONAL  ELECTROMAGNETICS  APPLIED  TO  SHIP  DESIGN  (parallel  with  Sessions  5  and  7)  102  Glasgow  Hall 
ORGANIZERS: J. NEWCOMB AND J. LOGAN


0840  "The  Naval  Sea  Systems  Command 

Electromagnetic  Engineering  Program"  (Invited) 

0900  "EM  Engineering  System  Architecture"  (Invited) 

0920  "EM  Engineering  Ray  Tracing  and  Casting  Model 
RTC"  (Invited) 

0940  "Ship  Transition-Frequency  EM  Environment  Analysis 
Requirements"  (Invited) 

1000  BREAK 


D. Cebulski, N. Baron and J. Eadie


1020  "Finite  Volume  Time  Domain  Analysis  of  Ship  Topside  EM 
Environment  Features"  (Invited) 

1040  "EM  Engineering  Ship  End-To-End  Application"  (Invited) 


B.  Hall,  A.  Mohammadian,  C.  Rowell 
and  V.  Shankar 


L.  R.  Carlson,  C.  F.  duster,  G.  R.  Allen 



SESSION 6: (CONT) COMPUTATIONAL ELECTROMAGNETICS APPLIED TO SHIP DESIGN (parallel with Sessions 5 and 7)  102 Glasgow Hall

1100 

"EM  Engineering  Applied  to  Patrol  Craft  (PC-1)"  (Invited) 

D.  Tam,  J.  McGee,  C.  Azu  and  M.  Soyka 

1120 

"Shipboard Antenna Pattern Visualization and Analysis" (Invited)

L.C.  Russell,  J.C.  Logan, 

J.W. Rockway and D.F. Schwartz

LUNCH 

SESSION  7:  FINITE  DIFFERENCE  TIME  DOMAIN  (parallel  with  Sessions  5  and  6) 
ORGANIZER:  J.  BEGGS 

122  Ingersoll  Hall 

0840 

"Computational  Analysis  of  Radiation  from  an  Elliptical  Shaped 

End  Radiator"  (Invited) 

S.  A.  Blocher,  E.  A.  Baca  and 

J.  H.  Beggs 

0900 

"A  Time  Domain  Harmonic  Oscillator  Model  for  an  FDTD  Treatment  of 

Lossy  Dielectrics"  (Invited) 

K.  S.  Kunz 

0920 

"FDTD  Modeling  of  Electromagnetic  Wave  interactions  with  Composite 
Random  Sheets”  (Invited) 

J.G.  Maloney  and  B.L.  Shirley 

0940 

"An  Improved  Near  to  Far  Field  FDTD  Algorithm"  (Invited) 

K.  S.  Kunz 

1000 

BREAK 

1020 

"Unstructured  Finite-Volume  Modeling  in  Computational  Electromagnetics" 
(Invited) 

D.J. Riley and C.D. Turner

1040 

"Scattering  from  Coated  Targets  Using  a  Frequency- 
Dependent,  Surface  Impedance  Boundary  Condition  in  FDTD" 

C.W.  Penney,  R.J.  Luebbers  and 

J.W.  Schuster 

1100 

"Hybrid  Finite  Difference  Time  Domain  and  Finite  Volume  Time  Domain  in 
Solving  Maxwell’s  Equations"  (Invited) 

K.S.  Yee  and  J.S.  Chen 

1120 

"Reducing  the  Number  of  Time  Steps  Needed  for  FDTD  Antenna  and 
Microstrip  Calculations"  (Invited) 

R. Luebbers and H.S. Langdon

1140 

"Numerical  Simulations  of  Light  Bullets,  Using  the  Full  Vector,  Time 
Dependent,  Nonlinear  Maxwell  Equations" 

P. Goorjian and Y. Silberberg

LUNCH 

WEDNESDAY  AFTERNOON  22  MARCH 

SESSION 8: BERENGER'S BOUNDARY CONDITION (parallel with Sessions 10 and 11)  102 Glasgow Hall
ORGANIZER: J. FANG

1320 

"Ultrawideband Termination of Waveguiding and Multilayer

Structures for FD-TD Simulations in 2-D and 3-D" (Invited)

C.E.  Reuter,  R.M.  Joseph,  E.T. 

Thiele,  D.S.  Katz  and  A.  Taflove 

1340 

"A  3-D  Perfectly  Matched  Medium  by  Coordinate  Stretching 
and  Its  Absorption  of  Static  Fields"  (Invited) 

W.C.  Chew,  W.H.  Weedon 
and A. Sezginer

1400 

"Perfectly  Matched  Anisotropic  Absorbers  for  Finite  Element 

Applications  in  Electromagnetics" 

D.M.  Kingsland,  Z.S.  Sacks 
and  J.F.  Lee 

1420 

"Modification  of  Berenger's  Perfect  Matched  Layer  for  the 

Absorption  of  Electromagnetic  Waves  in  Layered  Media" 

M.  Gribbons,  S.K.  Lee 
and  A.C.  Cangellaris 

1440 

"Performance  of  the  Perfectly  Matched  Layer  in  Modeling 

Wave  Propagation  in  Microwave  and  Digital  Circuit  Interconnects" 

Z.  Wu  and  J.  Fang 

1500 

BREAK 



SESSION 9: TIME DOMAIN/FDTD (parallel with Sessions 10, 11 and 12)
CHAIRS: L. LONG, J. MALONEY


102 Glasgow Hall


1520 "A FVTD Algorithm for Maxwell's Equations on
Massively Parallel Machines"

1540 "The Piecewise Linear Recursive Convolution Method for
Incorporating Dispersive Media into FDTD"

1600  "Combining  Different  Coordinate  Systems  in  the  Time 
Domain  Finite  Difference  Method” 


V.  Ahuja  and  L.N.  Long 


D.F.  Kelley  and  R.  J.  Luebbers 


M.  Mrozowski,  M.  Okoniewski, 
M.A.  Stuchly  and  S.S.  Stuchly 


1620  "Time  Domain  Response  of  Simulated  2D  Composite  Scatterers" 

1640  "An  Object-Oriented  Approach  to  Writing 
Computational  Electromagnetics  Codes" 


A.Z. Elsherbeni and P. Goggans


M.  Zimmerman  and  P.  Mallasch 


SESSION  10:  FAST  ALGORITHMS  FOR  COMPUTATIONAL  ELECTROMAGNETICS  (parallel  with  Sessions  8,  9,  11,  and  12) 


ORGANIZERS:  E.  MICHIELSSEN  AND  W.  CHEW 


122  Ingersoll  Hall 


1320 "On the Use of Wavelet-Like Basis Functions in the Finite Element
Analysis  of  Elliptic  Problems"  (Invited) 

1340  "Fast  Wavelet  Algorithm  (FWA)  for  Moment  Method  Analysis  of 
Electromagnetic  Problems"  (Invited) 


K.  Sabetfakhri  and  L.P.B.  Katehi 


1400  "Fast  Far  Field  Approximation  for  Calculating  the  RCS  of  Large  Objects"  C.C.  Lu  and  W.C.  Chew 

(Invited) 

1420 "The Parameter Estimation Technique (PET): Speeding Up Dense Matrix  C. Hafner and J. Fröhlich

Methods"  (Invited) 

1440  "A  Novel  Scheme  for  Massively  Parallel  Solution  of  Maxwell’s  Equations  M.A.  Jensen,  Y.  Rahmat-Samii 
using  FDTD"  (Invited)  and  A.  Fijany 

1500  BREAK 

1520 "Reduction of the Filling Time of Method of Moments Matrices" (Invited)  G. Vecchi, P. Pirinoli, L. Matekovits and M. Orefice


1540  "The  Fast  Multipole  Method  for  Large  2d  Scatterers"  (Invited) 


L. R.  Hamilton,  J.J.  Ottusch, 

M. A.  Stalzer,  R.S.  Turley, 

J.L.  Visher  and  S.M.  Wandzura 


1600  "A  Multilevel  Matrix  Decomposition  Algorithm  for  Analyzing  Scattering  E.  Michielssen  and  A.  Boag 

from  Large  Structures"  (Invited) 

1620  "A  3D  Fast  Multipole  Method  for  Electromagnetics  with  Multiple  Levels"  B.  Dembart  and  E.  Yip 

(Invited) 


1640 "Fast Multipole Method Solution of Combined Field Integral Equation"


J.M.  Song  and  W.  C.  Chew 


SESSION  11:  MICROWAVE  AND  GUIDED  WAVE  (parallel  with  Sessions  8  and  10)  361  Ingersoll  Hall 

CHAIRS:  P.  GOGGANS,  A.  TERZUOLI 

1320  "Computer-Simulation  of  Isotropic,  Two-Dimensional  Guided-Wave  R.A.  Speciale 

Propagation" 

1340  "Analysis  of  Ultra-Short  Pulse  Propagation  on  Uniform  and  Tapered  Printed  R.A.O.  Veliz  and  J.R.  Souza 
Transmission  Lines" 


1400  "Wave-Field  Patterns  on  Electrically  Large  Networks" 

1420  "Scattering  Characteristics  of  Dissimilar  Waveguide  Slot  Couplers" 
1440 "An Alternative Formulation of the Transverse Resonance Technique"


A.  Singh  and  K.S.  Christopher 

A.G.  Neto,  S.  Ariguel,  H.  Aubert, 
D.  Bajon  and  H.  Baudrand 


1500  BREAK 




SESSION 12: MOM (parallel with Sessions 9 and 10)  361 Ingersoll Hall

CHAIRS: A. PETERSON, R. ZIOLKOWSKI

1520 "Moment Method Analysis of Dielectric Covered Radiating Slots Using  S. Christopher, V.V.S. Prakash,

Alternative  Green's  Function  Approach"  A.K.  Singh  and  N.  Balakrishnan 

1540  "Computation  of  E-field  Distribution  of  Low  Gain  J.  Liu,  J.  Wang  and  Y.  Gao 

Antenna  on  Conducting  Body  of  Revolution" 

1600  "An  Implementation  of  an  Exact  Scheme  for  Problem  Decomposition  D.L.  Wilkes,  C-C.  Cha  and  T.  Krauss 

Via  the  Use  of  Aperture  Admittance" 

1620  "Parallelization  of  the  Parametric  Patch  Moment  Method  Code"  X.  Shen,  G.E.  Mortensen,  C.C.  Cha, 

G.  Cheng  and  G.  C.  Fox 

1640  "A  Tool  Box  for  Parallelization  of  Moments  Method  Codes"  E.  Yip,  B.  Blakely,  L.  Johnson, 

D. Jurgens and R. Kochhar

THURSDAY MORNING 23 MARCH

0730  CONTINENTAL  BREAKFAST 

SESSION 13: RECENT DEVELOPMENTS IN FDTD ANALYSIS (parallel with Sessions 14 and 15)  102 Glasgow Hall

ORGANIZERS:  M.  PIKET-MAY  AND  D.  KATZ 

0840  "Simulation  of  Microwave  Circuits  by  FDTD  Method"  (Invited)  C.N.  Kuo,  B.  Houshmand  and  T.  Itoh 

0900  "Adaptation  of  FDTD  Techniques  to  Acoustic  Modeling"  (Invited)  J.G.  Maloney  and  K.E.  Cummings 

0920  "FDTD  Investigation  of  the  Antenna-Tissue  Interaction  for  Cellular  and  Y.  Rahmat-Samii  and  M.A.  Jensen 

Satellite  Systems"  (Invited) 

0940  "FDTD  Modeling  of  Ground-Penetrating  Radar  Antennas"  B.J.  Zook 

1000 BREAK

1020  "FDTD  Modeling  of  Ultrashort  Optical  Pulse  Interactions  with  R.W.  Ziolkowski 

Nonresonant  and  Resonant  Materials  and  Structures"  (Invited) 

1040  "Time  Domain  Analysis  of  Electromagnetic  Wave  Propagation  in  Nonlinear  G.  Miano,  C.  Serpico, 

Dielectric  Slab"  L.  Verolino  and  F.  Villone 

1100 "An Efficient Sub-gridding Algorithm for FDTD."  D.T. Shimizu, M. Okoniewski and M.M. Stuchly

1120 "Using the Integral Forms of Maxwell's Equations to Modify and Improve  M.F. Hadi and M. Piket-May

the FDTD (2,4) Scheme"

1140 "From the Berenger PML ABC to Micro-Lasers: Recent Advances in FD-TD  A. Taflove

Modeling  Techniques"  (Invited) 

LUNCH 

SESSION  14:  PROPAGATION  (parallel  with  Sessions  13  and  15)  361  Ingersoll  Hall 

ORGANIZER:  K.  CHAMBERLIN 

0840  "Terrain  and  Refractivity  Effects  in  a  Coastal  Environment;  A.  Barrios 

Results  from  the  VOCAR  Experiment" 

0900  "Capabilities  and  Limitations  Associated  With  Using  GTD  to  Model  K.  Chamberlin 

Propagation  Path  Loss  in  the  Presence  of  Irregular  Terrain" 

0920 "Comparison of Electromagnetic Wave Propagation Computer  S.A. Fast and T.H. Koschmieder

Programs" 

0940  "A  Model  for  Estimating  Electromagnetic  Wave  Attenuation  C.  Welch,  C.  Lemak,  and  L.  Corrington 

in  a  Forest  (EWAF)  Environment" 

1000 BREAK

1020  "Validation  of  the  Radio  Physical  Optics  Propagation  Model” 



R.A.  Paulus 




SESSION 14: PROPAGATION (parallel with Sessions 13 and 15) (CONT)

1040  "VTRPE:  A  Variable  Terrain  Electromagnetic  Parabolic 
Equation  Model" 

1100 "Estimating Tropospheric Refractivity Fields Using a Nonlinear
Gauss-Markov Procedure and the PE Model"


361  Ingersoll  Hall 


D.  Boyer  and  F.J.  Ryan 


1120 "Modeling of Radio Wave Ducting Over Regular Boundary"


SESSION  15:  PARALLELIZATION  OF  EM  CODES  (parallel  with  Sessions  13  and  14) 
ORGANIZERS:  J.  VOLAKIS  AND  A.  CHATTERJEE 

0840  "Advances  in  Time-Domain  CEM  Using 

Massively  Parallel  Architectures"  (Invited) 

0900 "Parallel Solutions of Maxwell's Equations on the Meiko CS-2" (Invited)


122  Ingersoll  Hall 


C. Rowell, V. Shankar, W.F. Hall
and  A.  Mohammadian 


N.  Madsen,  B.  Erne,  D.  Steich 
and  G.  Cook 


0920  "Parallelization  of  the  CARLOS-3D  Method  of  Moments  Code"  (Invited) 


J.M.  Putnam,  D.D.  Ca 
and J.D. Kotulski


0940 "Parallel Computing for Electromagnetism at ONERA" (Invited)
1000 BREAK

1020  "The  Performance  of  the  Parallel  Solution  of  the  Quasi-Minimal 
Residual  (QMR)  Method  on  2D  Mesh  Architectures"  (Invited) 


A.  de  La  Bourdonnaye,  A.  Cosnuau, 
X. Ferrières, P. Leca and F.X. Roux


L.  Hamandi,  F.  Ozguner  and  R.  Lee 


1040  "Advanced  Parallel  Solver  Techniques"  (Invited) 

1100  "Parallelized  FDTD  for  Antenna  Radiation  Pattern  Calculations" 


Z.M.  Liu,  A.S.  Mohan,  T.  Aubrey  and  W.R. 
Belcher 


1120  "Calculation  of  Electromagnetic  Fields  with  the  Multiple  Multipole 
Method  (MMP  Method)  on  Parallel  Computers” 

1140 "Implementation of the Finite-Difference Time-Domain Method on Parallel
Computers" 


C.  Tudziers  and  H.  Singer 


R.S. David and L.T. Wille


THURSDAY  AFTERNOON  23  MARCH 

SESSION 16: EM THEORY II (parallel with Sessions 18 and 20)

CHAIRS:  K.  YEE,  R.  GORDON 

1320  "FDTD  Investigation  of  the  Ability  to  Increase  Electromagnetic 
Fields  Around  Head  Tumors" 

1340  "FDTD  and  PMM  Based  Design  of  a  TEM  Horn  Antenna  with  Reduced 
Off-Boresight  Fields" 

1400  "Determination  of  the  Complex  Aperture  Distribution  of  a  Planar 
Spiral  Antenna  from  3D  Far-Field  Radiation  Pattern  Data" 

1420  "Analysis  of  Micro-Contamination  of  Silicon  Wafers  Based 
on  Discrete  Sources  Method  (DSM)" 

1440  "Analysis  of  Convergence  Properties  of  Projection 
Methods  for  Solving  CEM  Applications" 


361  Ingersoll  Hall 


D.B. Dunn, A.J. Terzuoli, Jr.,

G.C.  Gerace  and  C.A.  Rappaport 


D.J.  Wolstenholme, 

A.J.  Terzuoli,  Jr.  and  G.  C.  Gerace 


M.  Kluskens,  W.  Lippincott 
and  M.  Kragalott 


Y.A.  Eremin  and  N.W.  Orlov 

V.I. Ivakhnenko, A.V. Kukk, E.E.
Tyrtyshnikov,  A.Y.  Yeremin 
and  N.L.  Zamarashkin 



SESSION  17:  ELECTROMAGNETIC  MODELING  TECHNIQUES  FOR  INTEGRATED  OPTICS  (parallel  with  Sessions  18,  19  and  20) 

ORGANIZER:  A.  CANGELLARIS  361  Ingersoll  Hall 

1520 

"Analysis  and  Design  of  Guided-wave  Optical  Devices  Using 
Finite-Difference  Time-Domain  Method"  (Invited) 

S.l.  Chaudhuri  and  S.T.  Chu 

1540 

"Vectorial  Analysis  of  Optical  Waveguides  by  the  Method  of 

Lines"  (Invited) 

R.  Pregla  and  W.  Pascher 

1600 

"Vector  Finite  Element  Analysis  of  Lossless  and  Lossy 

Dielectric  Waveguides"  (Invited) 

P.  Cheung  and  A.  Gopinath 

1620 

“NL-FDTD  Modeling  of  Linear  and  Nonlinear  Corrugated 

Waveguiding  Systems  for  Integrated  Optics  Applications”  (Invited) 

R.W.  Ziolkowski  and  J.B.  Judkins 

1640 

"Analysis  of  Coupled  Nonlinear  Optical  Waveguides  by  Matrix  Method" 

V.  Tripathi,  A.  Weisshaar  and  H.S.  Chang 

SESSION 18: TOPICS IN FRACTAL AND WAVELET ELECTRODYNAMICS (parallel with Sessions 16, 17 and 20)  102 Glasgow Hall

ORGANIZERS:  D.H.  WERNER  AND  P.L.  WERNER 

1320 

"An  Overview  of  Fractal  Electrodynamics  Research"  (Invited) 

D.H.  Werner 

1340 

"Fractal  Arrays  and  Fractal  Radiation  Patterns"  (Invited) 

P.L. Werner, D.H. Werner and A.J. Ferraro

1400 

"Wavelet  Transforms  and  Time/Time-scale  Analysis”  (Invited) 

R.K.  Young  and  T.G.  Golsberry 

1420 

"Wavelet-based  Processing  to  Efficiently  Achieve  Broadband 

Monostatic  and/or  Passive  Cross-sensor  Processing"  (Invited) 

R.K.  Young  and  L.H.  Sibut 

1440 

"The  Intervallic  Wavelets  with  Application  in  the  Surface  Integral 

Equations"  (Invited) 

G.W.  Pan  and  J.Y.  Du 

1500 

BREAK 

1520 

"Radar  Cross  Section  Data  Reduction  Using  Wavelets" 

A.S.  Ali,  S.E.  Duval  and  R.L.  Haupt 

SESSION  19:  NEC  APPLICATIONS  (parallel  with  Session  17  and  20) 

ORGANIZER:  J.  BREAKALL 

1540 "Computationally Efficient and Accurate

Approximations for Impedance Matrix Elements of

NEC-Type  Method  of  Moments  Formulations" 

102  Glasgow  Hall 

D.H.  Werner,  S.E.  Metker 
and  J.A.  Huffman 

1600 

"Development of the Coupled-Resonator Antenna Principle -

A Computer Modeling Case History"

G.  Breed 

1620 

"Antenna  Design  &  Development  Using  NEC-WIN" 

T.  A.  Erdley,  J.  J.  Shapiro, 

J.S. Young and J.K. Breakall

1640 

"The "Paint" System - A UTD/NEC Hybrid Package for Simulating Antenna
Patterns Over 3-Dimensional Irregular Terrain"

J.S.  Young  and  J.K.  Breakall 

SESSION  20:  FEM  (parallel  with  Sessions  16,  17,  18,  and  19) 

CHAIRS:  R.  BURKHOLDER,  J.  KARTY 

122 Ingersoll Hall

1320 

"Numerically  Characterizing  Electromagnetic  Fields  Local  to  the  Edge  of 
a  Conducting  Strip  Using  a  Matched  Asymptotic  Technique  and  the  Finite 
Element  Method" 

A.S.  Ali  and  C.  L.  Holloway 

1340 

"An  Enhanced  "A  Posteriori"  Remeshing  Algorithm  for  Adaptive  Meshing 
of  2D  Finite  Element  Problems" 

P. Girdinio, A. Manella and

G.  Molinari 

1400 

"Finite  Element  Analysis  of  Waveguides  Using  Edge-Based  Magnetic 

Vector  Potential  and  Nodal-Based  Electric  Scalar  Potential" 

J.F. Lee, G. Lizalek and J. Brauer

1420 

"A  Scattering  Analysis  of  Laser  Beam  Wave  by  Groove  Pits  on  Optical 
Memory  Disk  by  Using  FEM  with  BEM" 

Y.  Miyazaki  and  K.  Tanaka 

1440 

"3D  Nodal-  and  Mixed-Based  Elements  for  Unbounded 

Microwave  Problems" 

A.  Nicolas,  L.  Nicolas  and  J.L.  Yao-bi 

1500 

BREAK 


THURSDAY AFTERNOON 23 MARCH

SESSION 20: FEM (parallel with Sessions 16, 17, 18, and 19) (CONT)

122 Ingersoll Hall

1520  "A Rationale for the Use of Mixed-order Basis Functions Within Finite Element Solutions of the Vector Helmholtz Equation"  A.F. Peterson and D.R. Wilton

1540  "Finite Element Waveguide Simulator Techniques"  J.R. Sanford and N.M. Johansson

1600  "A Solution for Open Boundary Electromagnetic Field Problems by Mapped Infinite and Virtual Elements"  L.H.A. de Medeiros and A. Raizer

FRIDAY MORNING 24 MARCH


0730  CONTINENTAL  BREAKFAST 

SESSION 21: EM ANALYSIS TECHNIQUES FOR ELECTRICALLY LARGE CAVITIES (parallel with Sessions 22 and 23)
ORGANIZER: D. PFLUG  122 Ingersoll Hall

0840  "Application of Modal and Plane Wave Expansions to Modeling Large Jet Engine Cavities" (Invited)  J.L. Karty and J.M. Roedder

0900  "Scattering from Dielectric Loaded Cavities Using Shooting and Bouncing Rays" (Invited)  M. Christensen, S.W. Lee, D.J. Andersh

0920  "Xpatch Simulation of Large Inlet Structures" (Invited)  R. Bhalla and H. Ling

0940  "An Iterative Physical Optics Approach for the EM Analysis of Cavities and Other Multi-Bounce Geometries" (Invited)  R.J. Burkholder


1000  BREAK 

1020  "Improved Ray Basis in the Hybrid Analysis of EM Scattering by Large Open Cavities" (Invited)  R.J. Burkholder, P.H. Pathak, H.T. Chou, D. Andersh and J. Path

1040  "Overlapping Modal and Geometric Symmetries for Computing Jet Engine Inlet Scattering" (Invited)  D.C. Ross, J.L. Volakis, H.T. Anastassiu and D. Andersh


SESSION 22: ACCURACY ESTIMATION IN ELECTROMAGNETIC MODELING (parallel with Sessions 21 and 23)  109 Glasgow Hall
ORGANIZER: S.M. WANDZURA

0840  "Assessing the Influence of Coefficient Accuracy, Matrix Condition Number, Size and Type, and Computer Precision on Matrix-Solution Accuracy" (Invited)  E.K. Miller


0900  "Numerical  Accuracy  Issues  in  Finite  Element  Frequency  Domain 
Solutions  of  Radar  Scattering  Problems"  (Invited) 

0920  "Accuracy  in  Computation  of  Matrix  Elements  of  Singular  Kernels" 

0940  "Accuracy  Estimation  and  High  Order  Methods" 


S.M.  Wandzura 

L.R. Hamilton, J.J. Ottusch, M.A. Stalzer, R.S. Turley, J.L. Visher and S.M. Wandzura


1000  BREAK 

1020  "Accuracy Issues in Time-Domain CEM Using Structured/Unstructured Formulations" (Invited)  V. Shankar, W.F. Hall and S. Palaniswamy

1040  "An Accuracy Study for the 3D Hybrid Finite Element Method of Moments SWITCH Code" (Invited)  G.E. Antilla and Y.C. Ma

1100  "Modeling Accuracy of 3D Method of Moments Techniques"  M.B. Gedera, L.N. Medgyesi-Mitschang, R.A. Pearlman, J.M. Putnam, D-S.Y. Wang


1120  "Requiring Quantitative Accuracy Statements in EM Data" (Invited)

SESSION 23: PDE METHODS IN ELECTROMAGNETICS (parallel with Sessions 21 and 22)  102 Glasgow Hall
ORGANIZERS: R. LEE AND J.-F. LEE


0840  "Optimization Issues in Finite Element Codes for Solving Open Domain 3D Electromagnetic Problems" (Invited)  A. Chatterjee and J.L. Volakis

0900  "A Characteristic-Based 3D Time Domain Maxwell Equation Solver" (Invited)  J.S. Shang and K.C. Hill

0920  "Finite Element Solution of Eddy Current Problems in Electromagnetics" (Invited)  O.A. Mohammed and G. F. Oier

0940  "Ten Years of Evolution of the FDTD-like Conformal Technique" (Invited)  K.S. Yee

1000  BREAK 

1020  "Whitney Elements Time Domain (WETD) Methods for Solving Three Dimensional Waveguide Discontinuities" (Invited)  J.-F. Lee

1040  "An FDTD/FVTD 2D-algorithm to Solve Maxwell's Equations"  J.S. Chen, J-V. Prodan and K.S. Yee

1100  "Spectral Finite Methods for the Simulation of Electromagnetic Interactions with Electrically Long Structures" (Invited)  A.C. Cangellaris and D. Hart

LUNCH


SATURDAY 25 MARCH  SHORT COURSES

0830-1630  SHORT COURSE (FULL-DAY)  101A Spanagel Hall

"Using Mathematical Software for Computational Electromagnetics"  Jovan Lebaric, Naval Postgraduate School


0830-1630  SHORT COURSE (FULL-DAY)  109 Glasgow Hall

"Wire Antenna Modeling Using NEC"

Dick Adler, Naval Postgraduate School

Jim Breakall, Penn State University

Gerry Burke, Lawrence Livermore National Lab


0830-1630  SHORT COURSE (FULL-DAY)  102 Glasgow Hall

"FDTD, Generalized FDTD and FVTD Techniques in Solving Maxwell's Equations"  Kane Yee, Lockheed


SESSION  II: 

MICROWAVE  AND  GUIDED  WAVE 


Chairs: P. Goggans, A. Terzuoli

COMPUTER-SIMULATION OF ISOTROPIC, TWO-DIMENSIONAL
GUIDED-WAVE PROPAGATION


Ross A. Speciale
Redondo  Beach,  California 


1  -  PRINCIPLES  OF  THE  SIMULATION  METHOD. 

A  relatively  simple  computer-simulation  method  has  been  developed,  for 
performing  a  first-approximation  analysis  of  various  two-dimensional  guided-wave 
propagation  processes.  The  method  has  been  applied  in  two  MATLAB  programs. 

One  of  the  programs  synthesizes  a  phase-conjugated  EM-field  that  approximates, 
with  its  amplitude-distribution,  the  shape  of  a  two-dimensional  Dirac  impulse-function. 
The  second  program  synthesizes  any  given  two-dimensional,  complex  aperture 
distribution,  in  terms  of  local  amplitude  and  phase. 

The developed simulation method uses (as an initial and qualitative
first approximation) a square, two-dimensional wave-guiding medium (such as the
well-known parallel-plate waveguide), characterized by: a) total uniformity in
any arbitrary azimuthal direction, b) mathematical linearity, reciprocity, and
losslessness, and c) complete isotropy, as defined by an azimuth-independent
propagation constant.

The  wave-propagation  properties  of  the  medium  are  thus  characterized  by  total 
absence  of  internal  scattering,  of  any  wave-attenuation  due  to  ohmic  or  radiation 
losses,  and  by  a  known  azimuth-independent  phase-constant.  The  developed  method 
can  be,  however,  easily  modified  to  account  for  any  known  type  of  anisotropy  of  the 
wave-propagation  medium,  if  such  anisotropy  can  be  defined  by  some  functional 
azimuth-dependence  of  the  propagation-constant.  Further,  any  analytically-defined 
loss-mechanism  can  also  be  easily  accounted  for. 

A large number of different two-dimensional guided-wave field-patterns of
increasing complexity have already been analyzed and synthesized. All the
two-dimensional wave-field patterns generated so far have been obtained as weighted
linear combinations of the individual two-dimensional wave-field patterns of large
numbers of mutually-coherent sources. The sources were mostly aligned along the
outer perimeter of the square wave-propagation domain, although the geometric
pattern of the source locations is quite arbitrary.

Many examples of sharp phase-conjugation focusing in two dimensions have been
synthesized, using either two or four sets of external sources, all aligned along two
or all four outer sides of the square domain. The sources are assumed, as a first
approximation, to be all unconditionally matched to the four sides of the
two-dimensional wave-guiding medium, to be somehow mutually isolated, and to have
totally different, mutually independent, and arbitrary amplitudes and phases.

The sides of the square wave-field domain where no external sources are
connected (if any) are also assumed to be unconditionally matched to an external,
reflectionless, dissipative medium that totally absorbs any incident wave-field.

An overwhelmingly large number of different two-dimensional wave-field
patterns can be obtained, simply by independently controlling the relative amplitudes
and the relative phases of all the mutually-coherent external sources.

A  large  number  of  two-dimensional  wave-patterns  have  already  been  generated, 
by  using  the  developed  simulation  method,  including  many  that  are  very  sharply 
focused  to  very  small  and  arbitrarily-located  spots  on  the  square  wave-propagation 
domain,  thus  forming  there  narrow  amplitude-peaks,  resembling  very  sharply-focused, 
two-dimensional  wave-caustics. 

The  developed  simulation  method  provides  very  clear  evidence  of  the  possibility 
of  arbitrarily  reshaping  the  complex-amplitude  distribution  of  a  two-dimensional 
wave-field  pattern,  by  simply  controlling  only  the  relative  amplitudes  and  the  relative 
phases  of  external  sources,  aligned  around  the  perimeter  of  the  considered 
wave-propagation  domain.  The  geometrical  shape  of  the  domain  perimeter,  where  the 
external  sources  are  assumed  to  be  connected,  is  obviously  quite  arbitrary,  and  it  was 
chosen square just for expediency. It could, however, be rectangular, hexagonal,
circular,  elliptic  or  any  other  desired  shape.  Further,  the  wave-propagation  domain 
need  not  even  be  planar,  but  could  for  instance  conform  to  a  closed  circular  cylinder, 
thus  leaving  only  the  two  opposing  rims  free  for  connecting  external  sources. 

In  this  case,  obviously,  multi-path  propagation  of  waves,  circulating  (clockwise 
or  counter-clockwise)  around  the  cylinder  axis  one  or  more  times,  should  be  taken 
into  account,  unless  the  curved  propagation  medium  is  assumed  to  be  very  lossy. 

The  wave-field  domain  could  also  conform  to  an  open-truncated,  spherical  dome, 
with  only  one  rim  being  accessible  to  external  excitation  sources. 

The developed wave-field pattern simulation method is presently being used as
a tool for establishing a deterministic wave-field synthesis procedure that
approximates any required wave-field distribution in a least-squares sense.

Recently  generated  preliminary  results  clearly  show  the  applicability  and 
powerful effectiveness of this least-squares synthesis procedure. Indeed, a Hansen
'one-parameter' aperture-distribution for circular apertures, with an H-parameter
value of 1.7254 (corresponding to a -40 dB first-sidelobe level), was approximated, in
a  least-squares  sense,  with  an  RMS  residual  deviation  of  only  a  few  parts  in  10  (for 
a  broadside,  and  for  an  off-broadside  beam-steering  direction). 

2 - REPRESENTATION OF THE TWO-DIMENSIONAL MEDIUM.

The mathematically linear, reciprocal, uniform, isotropic, and lossless two-dimensional
wave-guiding medium that is the domain of the wave-propagation process
is analytically represented by a large-order, complex, square matrix M. The complex
value of each matrix element is used to represent the local complex field amplitude of
any given wave-field pattern. The row and column indices 'i' and 'j' of each
matrix element are used to represent the discretized physical location of an arbitrary
point in the considered square domain of wave-propagation.

The square-matrix representation of a uniform and isotropic, physical two-dimensional
wave-guiding medium is thus implicitly used as either a discretized
representation of a continuous, uniform, parallel-plane TEM-waveguide, or as a
discretized representation of a continuous radiating aperture. In this analytical
representation, each matrix element corresponds to a node of a discretized, geometrical
square lattice, superimposed on the physical wave-field domain.

The  order  of  the  square  matrix  M,  used  in  the  performed  simulations  to 
represent  the  wave-field  domain,  was  initially  chosen  to  be  129  x  129,  with  the  (65,65) 
point representing the domain's central point.

3 - SIMULATION OF THE FIELDS OF THE EDGE-FEEDING SOURCES.


All the analyzed two-dimensional wave-field patterns are generated by weighted
linear superposition of different large sets of mutually-coherent component
wave-fields, each component being generated by one of the external edge-feeding sources,
in the form of a single azimuthally-symmetric cylindrical wave.

The radial dependence of each injected cylindrical wave is analytically
represented by the zero-order cylindrical Hankel function of the second kind:

$$H_0^{(2)}(k_n r) = J_0(k_n r) - i\,Y_0(k_n r) \qquad (i = \sqrt{-1}) \qquad (1)$$

where the symbol 'r' represents the radius-vector distance of any given field-point
(i, j) of the wave-propagation domain from the location (i_s, j_s) of any given
edge-feeding source:

$$r = \sqrt{(i - i_s)^2 + (j - j_s)^2}$$

and where k_n is the cylindrical wave-number, representing the asymptotic value of
the phase-rotation of the wave-field per 'lattice-unit' (defined as the node-to-node
spacing through the discretized square lattice, along its rows and columns).

The azimuthal dependence of a cylindrical wave, of any arbitrary type, is known
to be expressed by $e^{i m \phi}$, where m is the well-known cylindrical symmetry index. For
azimuthally-symmetric cylindrical waves, however, m = 0, so that the azimuthal
dependence function reduces to $e^{i 0 \phi} = 1$.

The locations (i_s, j_s) of all the external edge-feeding sources are aligned and
regularly spaced along the first and the last rows (i_s = 1, j_s = 1,...,129 and i_s = 129,
j_s = 1,...,129), and along the first and the last columns (i_s = 1,...,129, j_s = 1
and i_s = 1,...,129, j_s = 129) of the 129 x 129 complex matrix M, used to represent
the resulting total wave-field pattern.

Consistently with the assumptions of losslessness and total isotropy, the
medium is characterized by an azimuth-independent phase-constant, expressed in
degrees of phase-rotation between adjacent matrix elements, or geometrically between
the corresponding adjacent nodes of the discretized square lattice. These
assumptions lead to the choice of a constant, azimuth-independent value for the
cylindrical wave-number, expressed in radians per lattice-unit by k_n = 2π/λ_n
(where λ_n is the wavelength of an infinite, two-dimensional 'planar' wave, propagating
with a straight wavefront through the medium of the wave-propagation domain).
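
The field of a single unit-amplitude edge source on the lattice can be sketched as follows. This is an illustrative Python sketch, not the MATLAB program described in the text. To stay within the standard library it replaces the exact Hankel function of expression (1) with its large-argument form, sqrt(2/(πx))·exp(-i(x - π/4)), which is only accurate when k_n r is well above 1. The function names, the small 9 x 9 lattice, the wavelength of 4 lattice-units, and the choice to set the singular source node to zero are all assumptions made for the example.

```python
import cmath
import math

def hankel2_order0_asymptotic(x):
    """Large-argument approximation of H0^(2)(x) = J0(x) - i*Y0(x).

    Only accurate for x >> 1; an exact evaluation would use a special-function
    library instead.
    """
    return math.sqrt(2.0 / (math.pi * x)) * cmath.exp(-1j * (x - math.pi / 4.0))

def source_field(n, i_s, j_s, wavelength):
    """Field of one unit-amplitude source at node (i_s, j_s) of an n x n lattice.

    Indices are 1-based, as in the text. The source node itself (r = 0), where
    the Hankel function is singular, is left at 0 as a placeholder assumption.
    """
    k_n = 2.0 * math.pi / wavelength          # radians per lattice-unit
    field = [[0j] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            r = math.hypot(i - i_s, j - j_s)  # distance in lattice-units
            if r > 0.0:
                field[i - 1][j - 1] = hankel2_order0_asymptotic(k_n * r)
    return field

# Corner source at (1, 1) on a small lattice, wavelength of 4 lattice-units.
A = source_field(n=9, i_s=1, j_s=1, wavelength=4.0)
```

In this approximation the amplitude falls off as the inverse square root of the distance from the source, and the phase lags linearly with distance.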

4  -  FIELD  SYMMETRY  AND  COMPUTATIONAL  EFFICIENCY. 

A highly efficient computation procedure is made possible by the intrinsic
symmetries of the square wave-field domain, and of the regularly-spaced locations of
the 129 external sources aligned along each of the four sides. Indeed, only the single
component wave-field of any one of the four corner-sources needs to be computed, while
all the remaining component wave-fields are obtained by way of symmetry operations
and wave-field-splicing operations.

The wave-field patterns of the four corner-sources A, B, C, and D, located at
points (1,1), (129,1), (1,129), and (129,129), are generated first, with the complex
amplitudes of all four corner-sources set to unity (and zero phase). These four basic
wave-field patterns are stored in four complex 129 x 129 auxiliary matrices. Only the
wave-field pattern of the source A is actually computed, by repeatedly evaluating the
expression (1) through the 129 x 129 matrix A, with i_s = j_s = 1.

The  wave-field  pattern  of  source  B  is  then  generated  by  a  reflection-symmetry 
operation,  around  the  row  /  =  65,  performed  on  the  matrix  A  .  The  reflection  is 
simply  obtained  by  reversing  the  row-sequence  of  the  matrix  A  into  the  matrix  B  . 
Similarly,  the  wave-field  pattern  of  the  source  C  is  obtained  by  reversing  the  column- 
sequence  of  the  matrix  A  ,  into  the  auxiliary  matrix  C  .  Finally  the  wave-field  pattern 
of  the  source  D  is  obtained  by  reversing  the  column-sequence  of  the  matrix  B  ,  into 
the  auxiliary  matrix  D  . 
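
The three reflection operations can be sketched in a few lines of Python. The sketch is illustrative only: the lattice is 5 x 5 rather than 129 x 129, and `radial` is a hypothetical stand-in for the radial law (it is not the Hankel function), chosen so that the symmetry relations are easy to check.

```python
import math

def corner_field_A(n, radial):
    """Field of corner source A at node (1, 1); radial(r) is any radial law."""
    # 0-based (i, j) corresponds to node (i + 1, j + 1), so r = hypot(i, j).
    return [[radial(math.hypot(i, j)) for j in range(n)] for i in range(n)]

def reverse_rows(m):
    """Reflect a matrix about its central row."""
    return m[::-1]

def reverse_columns(m):
    """Reflect a matrix about its central column."""
    return [row[::-1] for row in m]

n = 5
radial = lambda r: 1.0 / (1.0 + r)   # stand-in radial law, an assumption
A = corner_field_A(n, radial)
B = reverse_rows(A)       # field of the source at (n, 1)
C = reverse_columns(A)    # field of the source at (1, n)
D = reverse_columns(B)    # field of the source at (n, n)
```

Only `A` requires evaluating the radial law; `B`, `C`, and `D` are pure index reversals.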

The fundamental wave-fields of the four corner-sources are then used to build up
the component wave-fields of all the other sources, by performing 127 matrix-splicing
operations on each of the four sides of the square domain. For instance, the
wave-field of a source E, located at point i_s = i_s(E), j_s = 1 along the A -> B side of
the square domain, is obtained by copying the set of rows 1 through 130 - i_s(E) of
matrix A to the rows i_s(E) through 129 of an auxiliary 129 x 129 complex matrix T, and then
copying rows 130 - i_s(E) through 128 of matrix B to rows 1 through i_s(E) - 1 of the
same auxiliary matrix T. The obtained wave-field of source E, which is represented
in the matrix T with unity source-amplitude and zero phase, is then multiplied by the
specified complex weight W(E), and added to the complex 129 x 129 matrix M, where
the total wave-field pattern is progressively accumulated, as a weighted linear
superposition of all the mutually-coherent component wave-fields of the 516 sources.
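
The row-splicing rule for a source on the A -> B side, followed by the weighted accumulation into M, can be sketched as below. This is an illustrative Python sketch for a general n x n lattice (n = 129 in the text); the function names, the 7 x 7 example, the weight value, and the stand-in radial law `radial` are assumptions made for the example.

```python
import math

def splice_side_AB(A, B, i_e):
    """Field of a source at (i_e, 1), 1-based, built from corner fields A and B."""
    n = len(A)
    T = [None] * n
    # rows 1 .. n+1-i_e of A  ->  rows i_e .. n of T
    for k in range(n + 1 - i_e):
        T[i_e - 1 + k] = A[k]
    # rows n+1-i_e .. n-1 of B  ->  rows 1 .. i_e-1 of T
    for k in range(i_e - 1):
        T[k] = B[n - i_e + k]
    return T

def accumulate(M, T, weight):
    """Add the weighted component field T into the running total M (in place)."""
    for i, row in enumerate(T):
        for j, value in enumerate(row):
            M[i][j] += weight * value

n = 7
radial = lambda r: 1.0 / (1.0 + r)            # stand-in radial law, an assumption
A = [[radial(math.hypot(i, j)) for j in range(n)] for i in range(n)]
B = A[::-1]                                   # corner source at (n, 1)
T = splice_side_AB(A, B, i_e=3)               # source at node (3, 1)
M = [[0j] * n for _ in range(n)]
accumulate(M, T, weight=0.5 - 0.5j)
```

With n = 129 the two copy ranges reduce to the row ranges quoted in the text (1 through 130 - i_s(E), and 130 - i_s(E) through 128).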

The matrix-splicing operation described for the generic source E, arbitrarily
located along the A -> B side, is similarly repeated for sources F, G, and H,
arbitrarily located along the other three sides. The auxiliary matrix T is repeatedly
re-used for temporary storage, prior to multiplication of the obtained spliced
wave-field by the corresponding complex weight, and final accumulation in the matrix M.

5  -  TWO-DIMENSIONAL  PHASE-CONJUGATION  FOCUSING. 

Phase-conjugated  focusing  of  the  total  wave-field  pattern  represented  in  the 
129  x  129  complex  matrix  M  ,  to  an  arbitrarily  selected  point  of  the  square  domain,  is 
obtained  by  appropriately  determining  the  required  complex  weights  of  all  the  sources 
aligned  along  each  of  the  four  sides.  All  the  complex  weights  are  generally  different, 
unless the focus location is set to the domain central point (65,65).

To obtain a phase-focused pattern, the complex weight of each source is set to
the value required to make the local amplitude of the corresponding component
wave-field equal to unity (with zero phase), at the selected focus location (i_F, j_F). The
required weight value is the reciprocal of the local complex amplitude of the
component wave-field built up in the auxiliary matrix T with unity source amplitude,
at the selected focus location point (i_F, j_F).
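
The weight rule above can be sketched as follows. This illustrative Python sketch uses a small 9 x 9 lattice with sources on two opposite sides only, and it approximates the Hankel function by its large-argument form, sqrt(2/(πx))·exp(-i(x - π/4)); the function names, lattice size, wavelength, and focus location are assumptions made for the example.

```python
import cmath
import math

def unit_field(n, i_s, j_s, k_n):
    """Unit-amplitude field of a source at (i_s, j_s); asymptotic Hankel form."""
    def h(x):
        return math.sqrt(2.0 / (math.pi * x)) * cmath.exp(-1j * (x - math.pi / 4.0))
    return [[h(k_n * math.hypot(i - i_s, j - j_s)) if (i, j) != (i_s, j_s) else 0j
             for j in range(1, n + 1)] for i in range(1, n + 1)]

def focusing_weights(component_fields, i_f, j_f):
    """Weight of each source = reciprocal of its unit-amplitude field at the focus."""
    return [1.0 / field[i_f - 1][j_f - 1] for field in component_fields]

n, k_n = 9, 2.0 * math.pi / 4.0
sources = [(i, 1) for i in range(1, n + 1)] + [(i, n) for i in range(1, n + 1)]
fields = [unit_field(n, i_s, j_s, k_n) for (i_s, j_s) in sources]
i_f, j_f = 4, 6
weights = focusing_weights(fields, i_f, j_f)

# Every weighted component equals 1 at the focus, so the total there is
# simply the number of sources.
total_at_focus = sum(w * f[i_f - 1][j_f - 1] for w, f in zip(weights, fields))
```

Because each weighted component arrives at the focus with unit amplitude and zero phase, all contributions add coherently there, while elsewhere their phases differ and partially cancel.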

The required magnitude of all the complex source weights is always larger than
unity, as required to compensate for the radial amplitude decay of an m = 0
cylindrical wave, which asymptotically approaches a 1/√r law. Similarly, the argument
of all the complex source weights always represents a phase-lead, relative to the
unweighted wave-field component, by an amount that continuously increases with
increasing distance 'r', measured from the source location (i_s, j_s) to the selected
focus-point (i_F, j_F). The complex weights of the four sources aligned on the same
row i_F, and on the same column j_F, of the focus-point have minimum magnitudes, and
minimum phase-conjugated leads, because they are located at the minimum distances
'r' on the respective sides of the wave-field domain.

Figures 1 and 2 show two typical examples of two-dimensional, phase-conjugated
focusing, in the form of 3D amplitude displays. The sharpness of the
amplitude-peak has been found to be largely independent of the specific
location selected for the focus-point.

6  -  LEAST  SQUARES  FITTING  OF  PRESCRIBED  APERTURE  DISTRIBUTIONS. 

The developed wave-pattern simulation method is presently being used as a tool
for evaluating a deterministic aperture-distribution synthesis procedure, capable of
approximating any required aperture distribution in a least-squares sense.
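
The text does not spell out how the least-squares fit is computed. One common way to pose it, shown here purely as an assumed illustration, is to collect the unit-amplitude source fields at the P field points in a P x Q matrix S, and to find the Q source weights w that minimise ||S w - g|| for a target distribution g by solving the normal equations (S^H S) w = S^H g. The helper names and the small test system below are hypothetical.

```python
import cmath
import math

def solve_linear(a, b):
    """Solve a x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x

def least_squares_weights(S, g):
    """Minimise ||S w - g||_2 via the normal equations (S^H S) w = S^H g.

    S[p][q] is the unit-amplitude field of source q at field point p.
    """
    P, Q = len(S), len(S[0])
    SH_S = [[sum(S[p][a].conjugate() * S[p][b] for p in range(P)) for b in range(Q)]
            for a in range(Q)]
    SH_g = [sum(S[p][a].conjugate() * g[p] for p in range(P)) for a in range(Q)]
    return solve_linear(SH_S, SH_g)

# Small synthetic check: a target that is an exact combination of 3 "sources".
S = [[cmath.exp(-2j * math.pi * p * q / 6) for q in range(3)] for p in range(6)]
true_w = [1 + 2j, -0.5j, 0.25 + 0j]
g = [sum(S[p][q] * true_w[q] for q in range(3)) for p in range(6)]
w = least_squares_weights(S, g)
```

For large or poorly conditioned systems the normal equations lose accuracy, and a QR- or SVD-based solver would normally be preferred.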

Recently  generated  preliminary  results  clearly  show  the  applicability  and 
powerful  effectiveness  of  this  least-squares  synthesis  procedure. 

A Hansen 'one-parameter' aperture-distribution for circular apertures, with
an H-parameter value of 1.7254 (corresponding to a -40 dB first-sidelobe level), is
approximated, in a least-squares sense, with an RMS residual deviation of only a few
parts  in  10"^  (for  both  a  broadside  and  an  off-broadside  beam-steering  direction). 

The ultimate objective of this effort is to determine the feasibility of
edge-feeding a two-dimensional, high-directivity, low-sidelobe, electronically-steered phased
array with a substantially reduced number of amplitude- and phase-controlled
microwave sources. The radiating elements of such a 'clustered' phased array [1,2]
are fed through the nodes of an electrically-large, underlying, two-dimensional,
periodic slow-wave delay-structure [3-7], approximating in discretized form a uniform
and isotropic wave-propagation medium.

Figures 3 and 4 show, as a typical result, the least-squares fitting of a
separable Hansen one-parameter distribution, defined on a 61 x 61 square-lattice
aperture by:


g[p,,Py)  -  “ 

where p_x = 2πx/L, p_y = 2πy/L (with L = domain side length), and H = 0.8899
(for a -25 dB sidelobe level).

Figure 2 - Two-Dimensional, Phase-Conjugated Focusing of the Wave-Field Pattern
of 516 Mutually-Coherent Edge-Feeding Sources: Focus at Point (65,65),

ABSTRACT

This paper presents a rigorous full-wave investigation of the propagation characteristics of very short electric
pulses on uniform and tapered printed transmission lines by means of the Spectral Domain Approach (SDA). Initially, a
frequency domain analysis is carried out to determine the variation of the propagation constant and characteristic
impedance over a sufficiently wide frequency band. In the case of a tapered transmission line, it is also necessary to
determine the variation of these parameters along the length of the line. Next, a time domain analysis is performed, using
the Fourier Transform, to simulate the propagation of the pulse. Although different planar transmission line
configurations were considered, results are presented here for microstrip lines, and for the combined CPW-slot line,
which has a typical CPW configuration on one side of the substrate, and that of a slotline on the other side. Comparison is
made between simulated and experimental results, with excellent agreement.

1.- INTRODUCTION 

It is now possible to generate and manipulate pulses with pico- and sub-picosecond duration, due to the
significant progress achieved by high speed optoelectronics. Such progress represents a continuing challenge in the
design and fabrication of the microwave circuits (MICs, MMICs, and HMICs) used in high speed systems, due to the
requirements of increasingly larger bandwidths. As the transmission line structures used in these circuits, such as
microstrip lines and coplanar waveguides (CPW), do not support TEM propagation, the frequency components of a short
electric pulse do not propagate at the same phase velocity. In consequence, the pulse shape is distorted as it propagates
along the transmission line. A knowledge of the propagation characteristics of short electric pulses along dispersive
transmission lines is necessary for the proper design of these circuits.

In the analyses found in the literature, only the frequency dependency of the propagation constant is
considered, while the characteristic impedance is generally considered as invariant. In some of the analyses, empirical
formulae are used to estimate the dispersion characteristics of the transmission lines. In this paper, the frequency
dependency of both the propagation constant and the characteristic impedance is taken into account with the well-known
Spectral Domain Approach, which allows full non-TEM analysis. The temporal response of the transmission lines is
calculated with algorithms based on the Fast Fourier Transform.

This paper then gives transfer functions for uniform and tapered microstrip and CPW-slot lines. An exponential
impedance profile was considered for the tapered transmission lines. Results of the distortion of Gaussian and square
pulses are shown and discussed. Also, comparison is made with experimental results, with excellent agreement.

2.- FORMULATION

The type of transmission lines considered here can be treated as a linear, time-invariant system, whose temporal
response, V(t,z), to an input signal V(t,z=0) can be written as:

$$V(t, z=D) = \mathcal{F}^{-1}\{\mathcal{F}[V(t, z=0)] \cdot T(\omega, z=D)\} \qquad (1)$$

where $\mathcal{F}$ represents the Fourier Transform, and T(ω,z=D), the transfer function of the transmission line.
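
The structure of Equation (1), transform, multiply by the transfer function, transform back, can be sketched as below. This is an illustrative Python sketch that uses a plain O(N²) discrete Fourier transform from the standard library in place of an FFT; the function names, the 1 ps sampling step, the Gaussian test pulse, and the ideal non-dispersive transfer function exp(-jωτ) are all assumptions made for the example. The forward transform is taken with exp(-jωt), so a transfer function exp(-jβD) corresponds to a delay.

```python
import cmath
import math

def dft(x, sign):
    """Plain discrete Fourier sum; sign = -1 forward, +1 inverse (unnormalised)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def propagate(v_in, dt, transfer):
    """Eq. (1): V(t, z=D) = F^-1{ F[V(t, z=0)] * T(omega, z=D) }."""
    n = len(v_in)
    spectrum = dft(v_in, -1)
    shaped = []
    for m, value in enumerate(spectrum):
        # signed frequency of DFT bin m (upper half of the bins is negative)
        f = (m if m < n // 2 else m - n) / (n * dt)
        shaped.append(value * transfer(2.0 * math.pi * f))
    return [v / n for v in dft(shaped, +1)]

# Lossless, non-dispersive line: T = exp(-j*omega*tau), a pure delay tau = D/v.
n, dt = 64, 1.0e-12
pulse = [math.exp(-((k - 20) / 4.0) ** 2) for k in range(n)]
tau = 5 * dt
out = propagate(pulse, dt, lambda w: cmath.exp(-1j * w * tau))
```

For this ideal line the output is the input pulse shifted by five samples with its shape unchanged; a dispersive line would instead use a transfer function whose phase is not linear in frequency.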

In the case of lossless, uniform lines, this function simply introduces a phase change, which is equal to the
product of the (imaginary) propagation constant, γ(ω), by the propagating distance, z=D. For lossy lines, an attenuation
factor must be added, as γ(ω) is then complex. In the case of a non-uniform, or tapered, transmission line, the transfer
function must incorporate the variation of the propagation constant and the characteristic impedance along the length of
the line. For example, for a tapered transmission line with an exponential impedance profile, as illustrated in Figure 1,
T(ω,z=D) can be written as:


r«o.z  =  D) .  [l  -  i[ln(|) .  [-;e(a..z  =  D)J  (2) 

with $\theta(\omega, z=D) = \int_0^D \gamma(\omega, z)\,dz$. The impedances Z_1 and Z_2 represent the initial and final impedances of the transmission
line, respectively.
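
An exponential impedance profile between an initial impedance Z_1 and a final impedance Z_2 is commonly written as Z(z) = Z_1 exp[(z/D) ln(Z_2/Z_1)]. The short Python sketch below evaluates this standard form; the function name is hypothetical, and the sample values (50 Ω to 100 Ω over 25 mm, sampled at 11 points) are chosen only for illustration.

```python
import math

def exponential_impedance(z, length, z1, z2):
    """Z(z) = Z1 * exp((z / length) * ln(Z2 / Z1)), for 0 <= z <= length."""
    return z1 * math.exp((z / length) * math.log(z2 / z1))

# 50 ohm to 100 ohm over 25 mm, sampled at 11 equally spaced positions.
profile = [exponential_impedance(k * 25.0 / 10, 25.0, 50.0, 100.0) for k in range(11)]
```

The impedance at the midpoint of such a taper is the geometric mean of the two end impedances.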

Equation (2), derived from the small reflection theory, assumes an exponential impedance profile for all
frequencies; this assumption implies that the impedance is frequency invariant. Such an approximation is only acceptable for
a very narrow frequency band, which in general is not compatible with ultra-short electric pulses. A new expression for
T(ω,z) will be derived, based on the exact multiple reflection theory. With this new expression, which is well suited
to numerical calculations, any impedance profile can be considered.


The non-uniform transmission line illustrated in Figure 1 can be represented by multiple sections of uniform
lines with length ΔZ, as shown in Figure 2. Each section is described by its own characteristic impedance and
propagation constant, which can both have any dependence on frequency. This approximation is exact in the limit, when
the number of sections tends to infinity. The equivalent impedance at the input of the i-th section is given by:

$$Z_{in,i} = Z_{0i}\,\frac{Z_{in,i+1} + j Z_{0i}\tan(\beta_i \Delta Z)}{Z_{0i} + j Z_{in,i+1}\tan(\beta_i \Delta Z)} \qquad (3)$$

With equation (3), all the impedances Z_in,i can be calculated recursively for the N sections. The total reflection
coefficient at the input of the transmission line, Γ_in, is determined as:

$$\Gamma_{in} = \frac{Z_{in,1} - Z_1}{Z_{in,1} + Z_1} \qquad (4)$$

Figure 2: Multiple uniform section representation of a non-uniform transmission line.


From Γ_in, the voltage, T_V, and the effective power, T_P, transmission coefficients are calculated as:

$$T_V = 1 + \Gamma_{in} \qquad (5)$$

$$T_P = \sqrt{1 - |\Gamma_{in}|^2} \qquad (6)$$

It is now clear that the transfer function T(ω,z) can be obtained from (5), or (6), according to:

$$T(\omega, z=D) = T_{V|P}\,\exp[-j\theta(\omega, z=D)] \qquad (7)$$

where

$$\theta(\omega, z=D) = \sum_{i=1}^{N} \beta_i\,\Delta Z \qquad (8)$$
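
At a single frequency, the recursion over uniform sections and the assembly of the transfer function can be sketched as follows. This is an illustrative Python sketch, not the authors' program: the function and argument names are hypothetical, the load impedance terminating the last section and the reference impedance used for the input reflection coefficient are explicit assumptions, and only the voltage coefficient T_V = 1 + Γ_in is used.

```python
import cmath
import math

def input_reflection(z0, beta, dz, z_load, z_ref):
    """Recurse the input impedance from the load back to the input; return Gamma_in.

    z0[i], beta[i]: characteristic impedance and phase constant of section i
    (i = 0 is the input section); z_load terminates the last section (assumed).
    """
    z_in = z_load
    for zc, b in zip(reversed(z0), reversed(beta)):
        t = cmath.tan(b * dz)
        z_in = zc * (z_in + 1j * zc * t) / (zc + 1j * z_in * t)
    return (z_in - z_ref) / (z_in + z_ref)

def transfer_function(z0, beta, dz, z_load, z_ref):
    """T = T_V * exp(-j*theta), with T_V = 1 + Gamma_in and theta = sum(beta_i * dz)."""
    gamma_in = input_reflection(z0, beta, dz, z_load, z_ref)
    theta = sum(b * dz for b in beta)
    return (1.0 + gamma_in) * cmath.exp(-1j * theta)

# Matched uniform line: four 1 mm sections of 50 ohm, beta = 100 rad/m.
t_uniform = transfer_function([50.0] * 4, [100.0] * 4, 0.001, 50.0, 50.0)

# Quarter-wave section of sqrt(50*100) ohm between 50 ohm and 100 ohm.
g_quarter = input_reflection([math.sqrt(5000.0)], [math.pi / 2.0], 1.0, 100.0, 50.0)
```

In a full simulation the lists `z0` and `beta` would be re-evaluated at every frequency sample, and the resulting T(ω) would then multiply the pulse spectrum.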


In the limit when ΔZ tends to zero, Riccati's equation is obtained, which gives the reflection coefficient Γ(z) at
any position along the transmission line. The analytical solution of this equation is only possible in certain restricted
cases, for specific impedance profiles. For an arbitrary impedance profile, numerical means of solution are necessary.

The formulation presented here simplifies the numerical calculation of the input reflection coefficient. The
accuracy of the calculation is determined by the length ΔZ chosen for the uniform sections. This formulation allows the
consideration of multiple reflections and of an arbitrary impedance profile, and takes into account the fully dispersive nature of the
transmission line. Equations (1) and (7) describe the algorithm developed to simulate the propagation of ultra-short pulses
in uniform and non-uniform transmission lines. The frequency domain analysis of each section, i.e. the calculation of the
propagation constant, β_i(ω), and of the characteristic impedance, Z_0i(ω), is performed with the Spectral Domain Approach
(SDA), while the Fast Fourier Transform is used for the time domain analysis.


3.- RESULTS 


3.1-  Microstrip  Lines 

First, the SDA was used to determine the dispersion characteristics of uniform and non-uniform microstrip lines.
Three lines were considered on a 0.635 mm thick substrate, with a dielectric constant ε_r = 10.5. For the uniform line, a
strip width of 0.588 mm was chosen, so that the line impedance was 50 Ω. For the non-uniform lines, the strip width
varied continuously from 0.588 mm to 1.918 mm in one line, and from 0.588 mm to 3.300 mm in the other one; both lines
were 25 mm long. The corresponding impedance transformation ratios are 2 and 3, respectively, with exponential
impedance profiles. The design frequency was 10 GHz. Figures 3 and 4 show, for the second tapered line, the variation of
the effective dielectric constant and of the characteristic impedance with position and with frequency. The dispersive
nature of the structure is represented by the frequency dependent characteristic impedance and effective dielectric
constant.
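As an illustration of how an exponential impedance profile can be discretized into uniform sections, the sketch below samples the profile at the section midpoints. The function name, the midpoint sampling and the choice of 20 sections are assumptions made for this example; the 50 Ω starting impedance, the ratio of 2 and the 25 mm length are taken from the text.

```python
import numpy as np

def exponential_taper(z_start_ohm, ratio, length, n_sections):
    """Sample Z(z) = z_start * exp(-(z / length) * ln(ratio)) at the
    midpoints of n_sections equal sections (hypothetical helper).
    The impedance falls from z_start to z_start / ratio over the taper."""
    dz = length / n_sections
    z_mid = (np.arange(n_sections) + 0.5) * dz
    z0 = z_start_ohm * np.exp(-(z_mid / length) * np.log(ratio))
    return dz, z_mid, z0

# 50 ohm line tapered to 25 ohm over 25 mm, split into 20 sections
dz, z_mid, z0 = exponential_taper(50.0, 2.0, 25e-3, 20)
```

With an exponential profile, the impedance ratio between neighbouring sections is constant, so each junction contributes a reflection of the same size.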
Next, pulse propagation was simulated. In all cases, a 30 ps (FWHM) Gaussian pulse was launched at z=0. The
temporal response of the three microstrip lines at z=D=25 mm is shown in Figure 5. It was observed that the transmission
line dispersion, represented by the non-linear dependence between the effective dielectric constant and frequency, was the
main cause of pulse distortion. The variation of the impedance along the length of the lines does not change the pulse
shape considerably; it only introduces a time delay and a small reduction in amplitude, due to mismatching. These effects
are more pronounced for larger impedance transformation ratios. It was also observed that the variation of impedance
with frequency has very little effect on the pulse shape, unless the pulse is extremely narrow. Both the frequency spectrum
of the input pulse, represented by the amplitude of the normalized voltage signal |V(f)/V(0)|, and the magnitude of the
total input reflection coefficient, |Γ_in|, are shown in Figure 6 for the tapered microstrip line with impedance transformation
ratio Z2/Z1=2. Two situations are illustrated in this figure: considering the variation of the characteristic impedance with
frequency (solid curve), and ignoring it (dashed curve). In the latter case, the input reflection coefficient reproduces that
of a TEM tapered transmission line with exponential impedance profile, and behaves like the function |sin(f)/f|, which
decays rapidly with frequency. When the variation of impedance with frequency is considered, the input reflection
coefficient remains unaltered at the lower frequencies, but becomes oscillatory at the higher frequencies, where the
components of the input pulse have negligible amplitude. That explains why the frequency dependent characteristic


Figure 3: Variation of the effective dielectric constant with frequency and position along the tapered microstrip line
with Z2/Z1=3.


Figure 4: Variation of the characteristic impedance with frequency and position along the tapered microstrip line with
Z2/Z1=3.
impedance does not have an appreciable effect on the pulse shape. In the case where the input signal has a much wider
frequency spectrum, either due to extremely narrow pulses or to pulses modulated by a very high frequency carrier, the
impedance dispersion must be taken into account.
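The comparison between the pulse spectrum and the |sin(f)/f| behaviour of the TEM exponential taper can be illustrated numerically. The sketch below uses the textbook small-reflection approximation for an exponential taper, |Γ_in| ≈ ½|ln(Z2/Z1)|·|sin(βD)/(βD)|, and the analytic spectrum of a Gaussian pulse. The constant effective permittivity of 7.0 and the function names are assumptions for this example; they are not values from the paper.

```python
import numpy as np

C0 = 299_792_458.0  # speed of light in vacuum (m/s)

def gaussian_spectrum(f_hz, fwhm):
    """|V(f)/V(0)| for the pulse v(t) = exp(-4 ln2 t^2 / fwhm^2)."""
    return np.exp(-(np.pi * f_hz * fwhm) ** 2 / (4.0 * np.log(2.0)))

def taper_gamma_mag(f_hz, ratio, length, eps_eff):
    """Small-reflection estimate of |Gamma_in| for a TEM exponential
    taper with a frequency-independent eps_eff (assumed value)."""
    beta_l = 2.0 * np.pi * f_hz * np.sqrt(eps_eff) / C0 * length
    # np.sinc(x) = sin(pi x) / (pi x), so this is sin(beta_l) / beta_l
    return np.abs(0.5 * np.log(ratio) * np.sinc(beta_l / np.pi))

length, ratio, eps_assumed, fwhm = 25e-3, 2.0, 7.0, 30e-12
f = np.linspace(0.0, 60e9, 601)
spectrum = gaussian_spectrum(f, fwhm)
gamma = taper_gamma_mag(f, ratio, length, eps_assumed)
f_first_null = C0 / (2.0 * length * np.sqrt(eps_assumed))  # beta * D = pi
```

Under these assumptions the first null of the reflection coefficient falls at a few GHz, while the 30 ps pulse has lost almost all of its spectral amplitude by about 50 GHz, which is the region where the text reports the oscillatory behaviour.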


Figure 5: Propagation of Gaussian pulse along uniform and tapered microstrip lines considering the frequency
dispersion of the effective dielectric constant and characteristic impedance.


Figure 6: Frequency spectrum of the input pulse and variation of the total input reflection coefficient of a tapered
microstrip line (Z2/Z1=2) with frequency, with (solid line) and without (dashed line) impedance
dispersion.
3.2- Experimental Results

To verify the adequacy of the formulation presented here, several experiments were carried out with the
microstrip lines described in Section 3.1. In order to provide a 50 Ω environment for the measurements, all the lines were
mounted "back to back". A Hewlett-Packard 8133A pulse generator and an HP 1817A sampling oscilloscope were used in
the experiments. The pulse at the generator output was measured, and then incorporated in the simulation algorithm. The
experimental and simulated results are presented in Figures 7, 8, and 9, for a uniform 50 Ω microstrip line, and for two
tapered microstrip lines (impedance transformers: 50-25-50 Ω and 50-16-50 Ω), respectively.

Figure  7  shows  the  results  obtained  for  the  uniform  microstrip  line,  where  the  dashed  curve  corresponds  to  the 
input  pulse,  which  is  approximately  rectangular  and  exhibits  distortion  due  to  dispersion  and  reflections  along  the 
connecting  cables.  The  other  two  curves  in  this  figure  correspond  to  the  simulated  and  measured  output  pulses.  A  very 
good  agreement  is  observed  between  the  simulated  and  measured  output  pulses,  except  for  a  smaller  amplitude  in  the 
latter.  This  is  due  to  the  fact  that  the  microstrip  losses  (dielectric,  ohmic  and  radiation  losses)  are  not  accounted  for  in  the 
simulation.  Also,  the  measured  pulse  incorporates  effects  due  to  the  launchers  and  connectors,  which  introduce  extra 
dispersion,  and  may  be  responsible  for  the  variable  time  delay  seen  in  the  measured  pulse  when  compared  with  the 
simulated  one.  Figures  8  and  9  show  similar  results  for  the  two  tapered  microstrip  lines.  Once  again,  the  simulated  and 
measured pulses are in very good agreement.


Figure 7: Simulated and measured pulses along a 50 mm long uniform microstrip line
(w=0.588 mm, h=0.635 mm, ε_r=10.5).


3.3- CPW-Slotline

A combined CPW-slot line 50 Ω-to-10 Ω impedance transformer, 25 mm in length, was also considered,
on a 0.635 mm thick substrate, with ε_r=38. For the SDA calculation, the transformer was divided into 20 sections of
1.25 mm. After the frequency domain characterization, a 50 ps FWHM Gaussian pulse was launched at the low
impedance side of this transmission line transformer (TLT). The dispersed pulse is shown in Figure 10. For comparison,
two other pulses are also shown in this figure: the one obtained after the same length of uniform line, whose transverse
dimensions are those of the low impedance side, and one obtained without considering the transmission line dispersion. It
is again seen in this figure that the main cause of pulse distortion is the transmission line dispersion, which introduces
some ringing at the head and tail of the pulse, and so reduces its amplitude, due to a redistribution of energy.


Figure 8: Simulated and measured pulses at a 50 mm long tapered microstrip line (Z2/Z1=2, h=0.635 mm, ε_r=10.5).


Figure 9: Simulated and measured pulses at a 50 mm long tapered microstrip line (Z2/Z1=3, h=0.635 mm, ε_r=10.5).


4.-  CONCLUSIONS 

This paper presented a rigorous investigation of the propagation characteristics of ultra-short electric pulses
along uniform and tapered transmission lines, considering their fully dispersive nature. Both Gaussian and rectangular
pulses were simulated. It was observed that the variation of the characteristic impedance with frequency affects the total
input reflection coefficient, which acquires an oscillatory behaviour at the higher frequencies. However, it was shown that
it does not contribute much to pulse distortion, except in the case of extremely narrow pulses, or pulses modulated by
very high frequency carriers. In general, the major cause of pulse distortion is the non-linear relation between the
effective dielectric constant and frequency. The adequacy of the formulation was confirmed experimentally for microstrip
lines: simulated and experimental results showed very good agreement.


Figure 10: Dispersed pulses for exponential CPW-slot line taper (Gaussian pulse propagation on the TLT).
WAVE-FIELD PATTERNS ON ELECTRICALLY LARGE NETWORKS.
Ross A. Speciale
Redondo  Beach,  California 


1  -  MULTIDIMENSIONAL  GUIDED-WAVE  PROPAGATION. 

Recently formulated extensions of the old, classical concepts of Image Impedance
and  of  Image  Transfer  Function,  from  the  basic  theory  of  simple  two-port  networks 
to  the  broader  context  of  large,  complex  systems  of  interconnected  multiport  networks, 
lead  to  new,  and  far-reaching  generalizations  of  well  known,  elementary  results. 

The  new,  generalized  results  describe,  in  rigorous  mathematical  terms,  the 
simultaneous  propagation  of  multiple  sets  of  guided  electromagnetic  waves  through 
electrically large, multi-dimensional delay-structures, constructed by interconnecting
large  numbers  of  microwave  multiport  networks. 

Closed  form  matrix  expressions  of  new  multidimensional  quantities  are  given, 
that  are  conceptually  equivalent  to  the  classical  wave-impedance,  and 
propagation-constant  of  electrical  transmission  lines.  All  the  given  matrix  expressions 
are  cast  in  mutually  complementary  formulations,  one  set  being  suited  to  the  analysis 
of  wave-guiding  system,  the  other  set  intended  for  system  synthesis  and  design. 

2  -  PRACTICAL  APPLICATIONS  OF  THE  NEW  RESULTS. 

Numerous and varied applications of new analysis and synthesis methods, based
on the generalized theory of guided-wave propagation presented here, have already
been identified and studied in substantial depth. The most notable of these
applications  is  perhaps  the  development  of  conceptually  new,  more  affordable 
configurations  for  the  feed-networks  of  large,  two-dimensional,  high  directivity, 
low-sidelobe,  electronically-steered  phased  arrays  [1,  2]. 

3  -  A  VASTLY  EXPANDED  FIELD  OF  APPLICABILITY. 

The  generalized  theory  of  guided  wave  propagation  presented  here,  applies  to 
2D  and  3D  wave-guiding  systems  of  interconnected  multiport  microwave  networks,  that 
may physically span tens, hundreds, or even thousands of free-space wavelengths in
all  linear  dimensions.  Further,  the  considered  network  systems  may  be  simultaneously 
excited  by  any  arbitrarily  large  number  of  mutually  coherent  sources,  connected  to 
the  network  system  at  an  equally  large  number  of  input  ports.  The  input  ports  may 
be  located  in  any  arbitrary  or  prescribed  pattern,  and  the  sources  may  conform  to 
any  given  arbitrary  or  prescribed  amplitude  and  phase  distribution. 

The  new  generalized  theory  of  guided  wave  propagation  is  applicable  to  the 
analysis  and  synthesis  of  practical  microwave  guiding  structures,  built  by 
interconnecting  well-known,  established  types  of  basic  microwave  devices  such  as 
coaxial  lines,  waveguides,  resonant  cavities,  couplers,  hybrid  junctions,  power 
splitters,  power  combiners,  matching  networks,  and  any  other  type  of  microwave 
device  known  to  be  linear,  at  least  below  some  given  level  of  power  density. 
The emphasis of the new treatment is, however, on periodic structures having
a substantial degree of modularity, as multi-atom molecules, or crystal lattices.

The  combined  modularity  and  periodicity  of  the  considered  microwave  structures 
also  implies  some  type  of  structural  symmetry,  including  some  of  the  point-symmetries 
of  crystallography.  The  new  generalized  theory  of  guided  wave  propagation  applies 
also  to  infinite  or  semi-infinite  wave  guiding  systems,  regarded  as  essentially  limitless 
interconnections of elementary, partial regions, or sub-arrays.

4  -  MULTI-DIMENSIONAL  WAVE-FIELDS. 

The  2D  or  3D  wave-field,  generated  by  each  of  the  external  excitation  sources, 
fills  the  whole  guiding  microwave  network  structure  with  position-dependent 
amplitudes  and  phases.  The  local  wave  amplitude  and  phase  of  each  single-source 
wave-field  depends  on:  a)  the  location  of  the  selected  observation  point,  b)  the 
location  of  the  source  connection-point,  c)  the  amplitude  and  phase  settings  of  the 
given  source,  d)  the  wave-propagation  properties  of  the  microwave  guiding  structure, 
and  e)  the  internal  impedances  of  all  the  sources,  and  of  the  passive  load-networks 
connected  to  the  system  external  ports. 

Further,  because  of  the  assumed  system  linearity  and  reciprocity,  linear 
superposition  of  the  component  wave-fields  of  all  the  active  system-excitation  sources 
takes  place,  everywhere  through  the  whole  wave-guiding  structure.  The  resulting, 
total  2D  or  3D  wave-field  pattern  is  therefore  a  weighted  linear  combination  of  the 
wave-fields  of  all  the  active  sources,  with  complex  weights  determined  by  the 
geometrical locations, relative amplitudes and relative phases of all the sources.

The old, well-known concept of 'artificial delay line' is here generalized to 2D,
and  3D  microwave  structures,  by  considering  large-scale,  periodic  systems  of 
interconnected  multiport  microwave  devices,  that  may  physically  span  large  numbers 
of  free-space  wavelengths. 

5  -  ANALYTIC  RESULTS. 

The known open-circuit impedance matrix $Z$ of the (n + N)-port network N
(shown in Figures 1 and 2) is asymmetrically partitioned in four blocks $Z_i$, having
dimensions consistent with the (n | N)-split of the total number of ports, between
the n-port interface 1 and the N-port interface 2.

The matrix is therefore partitioned in an n x n leading block $Z_1$ and
an N x N trailing block $Z_4$, while the remaining blocks $Z_2$ and $Z_3$ are both
rectangular, with dimensions n x N and N x n respectively. Assuming, for
instance, that n < N, the partitioned matrix is represented in the form:

$$Z = \begin{bmatrix} Z_1 & Z_2 \\ Z_3 & Z_4 \end{bmatrix} \tag{1}$$
where, because of the assumed reciprocity:

$$Z_1^T = Z_1 \quad (2) \qquad Z_4^T = Z_4 \quad (3) \qquad Z_3 = Z_2^T \quad (4)$$


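Two very fundamental matrix-algorithms describe mathematically the
transformation of the open-circuit impedance matrices $Z_{L1}$ and $Z_{L2}$ of totally general
and arbitrary load-networks $L_1$ and $L_2$ to the corresponding input impedance
matrices $Z_{in1}$ and $Z_{in2}$, expressed by:

$$Z_{in1} = Z_1 - Z_2 \cdot (Z_4 + Z_{L2})^{-1} \cdot Z_3 \tag{5}$$

$$Z_{in2} = Z_4 - Z_3 \cdot (Z_1 + Z_{L1})^{-1} \cdot Z_2 \tag{6}$$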


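Unconditional, bilateral matching of the network N is obtained if the matrices
$Z_{L1}$ and $Z_{L2}$ of the two load-networks $L_1$ and $L_2$ are respectively equal to the two
Image Impedance Matrices $Z_{I1}$ and $Z_{I2}$, expressed as functions of the four blocks $Z_i$
of the Z-matrix by:

$$Z_{I1} = (I_n - Z_2 \cdot Z_4^{-1} \cdot Z_3 \cdot Z_1^{-1})^{1/2} \cdot Z_1 \tag{7}$$

$$Z_{I2} = (I_N - Z_3 \cdot Z_1^{-1} \cdot Z_2 \cdot Z_4^{-1})^{1/2} \cdot Z_4 \tag{8}$$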
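The bilateral matching property can be checked numerically. The sketch below is illustrative only: it assumes the standard terminated-multiport relation Z_in1 = Z_1 − Z_2·(Z_4 + Z_L2)^{-1}·Z_3 (and its mirror image for interface 2), takes the image impedances as Z_I1 = (I_n − Z_2·Z_4^{-1}·Z_3·Z_1^{-1})^{1/2}·Z_1 and Z_I2 = (I_N − Z_3·Z_1^{-1}·Z_2·Z_4^{-1})^{1/2}·Z_4, and uses a Denman-Beavers iteration for the principal matrix square root. The random test matrix and all function names are assumptions of this example.

```python
import numpy as np

def sqrtm_db(m, iterations=30):
    """Principal matrix square root by Denman-Beavers iteration
    (assumes m has no eigenvalues on the negative real axis)."""
    y = m.astype(complex)
    z = np.eye(m.shape[0], dtype=complex)
    for _ in range(iterations):
        y, z = 0.5 * (y + np.linalg.inv(z)), 0.5 * (z + np.linalg.inv(y))
    return y

def image_impedances(z1, z2, z3, z4):
    """Image impedance matrices of the partitioned Z-matrix."""
    n, big_n = z1.shape[0], z4.shape[0]
    a = z2 @ np.linalg.inv(z4) @ z3 @ np.linalg.inv(z1)   # n x n
    b = z3 @ np.linalg.inv(z1) @ z2 @ np.linalg.inv(z4)   # N x N
    return sqrtm_db(np.eye(n) - a) @ z1, sqrtm_db(np.eye(big_n) - b) @ z4

def input_impedance(z_near, z_fwd, z_back, z_far, z_load):
    """Z_in = Z_near - Z_fwd (Z_far + Z_load)^-1 Z_back."""
    return z_near - z_fwd @ np.linalg.inv(z_far + z_load) @ z_back

# Hypothetical reciprocal (complex symmetric) network with n = 2, N = 3
rng = np.random.default_rng(0)
n, big_n = 2, 3
m = n + big_n
g = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
z = 0.5 * (g + g.T) + 8.0 * np.eye(m)
z1, z2, z3, z4 = z[:n, :n], z[:n, n:], z[n:, :n], z[n:, n:]

zi1, zi2 = image_impedances(z1, z2, z3, z4)
zin1 = input_impedance(z1, z2, z3, z4, zi2)  # interface 2 loaded by Z_I2
zin2 = input_impedance(z4, z3, z2, z1, zi1)  # interface 1 loaded by Z_I1
```

If the expressions hold, `zin1` reproduces `zi1` and `zin2` reproduces `zi2`, i.e. each interface sees its own image impedance when the other interface is terminated in the other image impedance.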

Further, under the obtained unconditional, bilateral matching, the input voltages
and currents are mapped, from one interface to the other, by the Image Transfer
Function Matrices $T_{VF}$, $T_{IF}$, $T_{VB}$, and $T_{IB}$ of the (n + N)-port network N,
defined by:

$$V_2 = T_{VF} \cdot V_1 \quad (9) \qquad -I_2 = T_{IF} \cdot I_1 \quad (10)$$

$$V_1 = T_{VB} \cdot V_2 \quad (11) \qquad -I_1 = T_{IB} \cdot I_2 \quad (12)$$

where $V_1$ and $I_1$ are the n-dimensional 'voltage-' and 'current-' vectors at the n-
port interface 1, and $V_2$ and $I_2$ are the N-dimensional 'voltage-' and 'current-'
vectors at the N-port interface 2.
The two 'forward' image transfer function matrices $T_{VF}$ (for voltage vectors)
and $T_{IF}$ (for current vectors), defined by the expressions (9) and (10), are both
N x n, and linearly map the 'input', n-dimensional voltage and current vectors $V_1$
and $I_1$ of the n-port 'input' interface 1 to the N-dimensional 'output' voltage
and current vectors $V_2$ and $I_2$ of the N-port 'output' interface 2 (Figure 1).

Similarly, the two 'backward' image transfer function matrices $T_{VB}$ (for voltage
vectors) and $T_{IB}$ (for current vectors), defined by the expressions (11) and (12),
are both n x N, and linearly map the 'input', N-dimensional voltage and current
vectors $V_2$ and $I_2$ of the N-port 'input' interface 2 to the n-dimensional
'output' voltage and current vectors $V_1$ and $I_1$ of the n-port 'output' interface
1 (Figure 2).

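The four Image Transfer Function Matrices $T_{VF}$, $T_{IF}$, $T_{VB}$, and $T_{IB}$ are
expressed as functions of the four blocks $Z_i$ of the Z-matrix by:

$$T_{VF} = Z_3 \cdot Z_1^{-1} \cdot [\, I_n + (I_n - Z_2 \cdot Z_4^{-1} \cdot Z_3 \cdot Z_1^{-1})^{1/2} \,]^{-1} \tag{13}$$

$$T_{IF} = Z_4^{-1} \cdot [\, I_N + (I_N - Z_3 \cdot Z_1^{-1} \cdot Z_2 \cdot Z_4^{-1})^{1/2} \,]^{-1} \cdot Z_3 \tag{14}$$

$$T_{VB} = Z_2 \cdot Z_4^{-1} \cdot [\, I_N + (I_N - Z_3 \cdot Z_1^{-1} \cdot Z_2 \cdot Z_4^{-1})^{1/2} \,]^{-1} \tag{15}$$

$$T_{IB} = Z_1^{-1} \cdot [\, I_n + (I_n - Z_2 \cdot Z_4^{-1} \cdot Z_3 \cdot Z_1^{-1})^{1/2} \,]^{-1} \cdot Z_2 \tag{16}$$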


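Two sets of closed-form expressions of the blocks $Z_i$ of the open-circuit
impedance matrix of the (n + N)-port network N, formulated as functions of
two mutually equivalent sets of image matrices, have been found and are given by:

$$Z_1 = (I_n + T_{VB} \cdot T_{VF}) \cdot (I_n - T_{VB} \cdot T_{VF})^{-1} \cdot Z_{I1} \tag{17}$$

$$Z_2 = 2\,(I_n - T_{VB} \cdot T_{VF})^{-1} \cdot T_{VB} \cdot Z_{I2} \tag{18}$$

$$Z_3 = 2\,(I_N - T_{VF} \cdot T_{VB})^{-1} \cdot T_{VF} \cdot Z_{I1} \tag{19}$$

$$Z_4 = (I_N + T_{VF} \cdot T_{VB}) \cdot (I_N - T_{VF} \cdot T_{VB})^{-1} \cdot Z_{I2} \tag{20}$$

$$Z_1 = Z_{I1} \cdot (I_n + T_{IB} \cdot T_{IF}) \cdot (I_n - T_{IB} \cdot T_{IF})^{-1} \tag{21}$$

$$Z_2 = 2\,Z_{I1} \cdot (I_n - T_{IB} \cdot T_{IF})^{-1} \cdot T_{IB} \tag{22}$$

$$Z_3 = 2\,Z_{I2} \cdot (I_N - T_{IF} \cdot T_{IB})^{-1} \cdot T_{IF} \tag{23}$$

$$Z_4 = Z_{I2} \cdot (I_N + T_{IF} \cdot T_{IB}) \cdot (I_N - T_{IF} \cdot T_{IB})^{-1} \tag{24}$$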


These  expressions  solve  the  practical  engineering  problem  of  designing  a  wave- 
guiding  microwave  network  structure  that:  a)  is  unconditionally  and  bilaterally 
matched  to  given  multi-phase  generators  and  multiport  load-networks,  and  b)  exhibits 
specified amplitude and phase transmissions of arbitrary, multiple sets of input waves,
in both the forward and backward directions. A typical practical example is the
design  of  the  feed-network  of  a  transmit/receive  phased  array. 
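The synthesis direction can be illustrated with a round trip: compute image impedance and voltage transfer matrices from a known Z-matrix, then rebuild the four blocks from them. The sketch below is a consistency check under stated assumptions, not the author's code: it takes T_VF = Z_3·Z_1^{-1}·[I_n + (I_n − Z_2·Z_4^{-1}·Z_3·Z_1^{-1})^{1/2}]^{-1}, its mirror image for T_VB, and the voltage-based synthesis expressions Z_1 = (I + T_VB·T_VF)(I − T_VB·T_VF)^{-1}·Z_I1, Z_2 = 2(I − T_VB·T_VF)^{-1}·T_VB·Z_I2, with Z_3 and Z_4 obtained by exchanging the roles of the two interfaces. The test network and all names are hypothetical.

```python
import numpy as np

def sqrtm_db(m, iterations=30):
    """Principal matrix square root (Denman-Beavers iteration)."""
    y = m.astype(complex)
    z = np.eye(m.shape[0], dtype=complex)
    for _ in range(iterations):
        y, z = 0.5 * (y + np.linalg.inv(z)), 0.5 * (z + np.linalg.inv(y))
    return y

def image_parameters(z1, z2, z3, z4):
    """Analysis: Z blocks -> image impedances and voltage transfer matrices."""
    i_n, i_big = np.eye(z1.shape[0]), np.eye(z4.shape[0])
    x = z3 @ np.linalg.inv(z1)          # N x n
    y = z2 @ np.linalg.inv(z4)          # n x N
    s_a = sqrtm_db(i_n - y @ x)         # (I_n - Z2 Z4^-1 Z3 Z1^-1)^(1/2)
    s_b = sqrtm_db(i_big - x @ y)       # (I_N - Z3 Z1^-1 Z2 Z4^-1)^(1/2)
    t_vf = x @ np.linalg.inv(i_n + s_a)
    t_vb = y @ np.linalg.inv(i_big + s_b)
    return s_a @ z1, s_b @ z4, t_vf, t_vb

def synthesize_blocks(zi1, zi2, t_vf, t_vb):
    """Synthesis: image matrices -> Z blocks (voltage-based set)."""
    i_n, i_big = np.eye(zi1.shape[0]), np.eye(zi2.shape[0])
    p = t_vb @ t_vf                     # n x n
    q = t_vf @ t_vb                     # N x N
    z1 = (i_n + p) @ np.linalg.inv(i_n - p) @ zi1
    z2 = 2.0 * np.linalg.inv(i_n - p) @ t_vb @ zi2
    z3 = 2.0 * np.linalg.inv(i_big - q) @ t_vf @ zi1
    z4 = (i_big + q) @ np.linalg.inv(i_big - q) @ zi2
    return z1, z2, z3, z4

# Hypothetical reciprocal network with n = 2, N = 3
rng = np.random.default_rng(1)
n, big_n = 2, 3
m = n + big_n
g = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
z = 0.5 * (g + g.T) + 8.0 * np.eye(m)
z1, z2, z3, z4 = z[:n, :n], z[:n, n:], z[n:, :n], z[n:, n:]

zi1, zi2, t_vf, t_vb = image_parameters(z1, z2, z3, z4)
rebuilt = synthesize_blocks(zi1, zi2, t_vf, t_vb)

# Forward excitation with interface 2 terminated in Z_I2
i1 = np.array([1.0 + 0.5j, -0.3j])
i2 = -np.linalg.inv(z4 + zi2) @ z3 @ i1
v1 = z1 @ i1 + z2 @ i2
v2 = z3 @ i1 + z4 @ i2
```

In a design flow the inputs to `synthesize_blocks` would be prescribed image matrices rather than values derived from an existing network; the round trip here only checks that the two directions are mutually consistent.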

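Finally, four 'inverse' image transfer function matrices $R_{VF}$, $R_{IF}$, $R_{VB}$, and $R_{IB}$
defined by:

$$V_1 = R_{VF} \cdot V_2 \quad (25) \qquad -I_1 = R_{IF} \cdot I_2 \quad (26)$$

$$V_2 = R_{VB} \cdot V_1 \quad (27) \qquad -I_2 = R_{IB} \cdot I_1 \quad (28)$$

are expressed by:

$$R_{VF} = Z_2 \cdot Z_4^{-1} \cdot [\, I_N - (I_N - Z_3 \cdot Z_1^{-1} \cdot Z_2 \cdot Z_4^{-1})^{1/2} \,]^{-1} \tag{29}$$

$$R_{IF} = Z_1^{-1} \cdot [\, I_n - (I_n - Z_2 \cdot Z_4^{-1} \cdot Z_3 \cdot Z_1^{-1})^{1/2} \,]^{-1} \cdot Z_2 \tag{30}$$

$$R_{VB} = Z_3 \cdot Z_1^{-1} \cdot [\, I_n - (I_n - Z_2 \cdot Z_4^{-1} \cdot Z_3 \cdot Z_1^{-1})^{1/2} \,]^{-1} \tag{31}$$

$$R_{IB} = Z_4^{-1} \cdot [\, I_N - (I_N - Z_3 \cdot Z_1^{-1} \cdot Z_2 \cdot Z_4^{-1})^{1/2} \,]^{-1} \cdot Z_3 \tag{32}$$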


The 'inverse' matrices $R_{VF}$ and $R_{IF}$ are both n x N, while the 'inverse'
matrices $R_{VB}$ and $R_{IB}$ are both N x n. The expressions (29) through (32) solve the
practical engineering synthesis problem of determining the input excitation vectors
$V_1$ and $I_1$ required to generate prescribed output wave-field patterns $V_2$ and $I_2$.

Electrically-large, two-dimensional, reciprocal wave-guiding structures, with a
square outer perimeter and lattice, exhibit rotation and reflection symmetries,
and are represented by complex, symmetric, 4x4 block-circulant Z-matrices [4,5].

Similarly, structures with a regular-hexagonal outer perimeter and lattice exhibit
$C_{6v}$ symmetries, and are represented by complex, symmetric, 6x6 block-circulant
matrices. The study of the symmetry-structure of the corresponding Image-Impedance
and Image-Transfer-Function matrices is the objective of current, intensive research.
SCATTERING CHARACTERISTICS OF DISSIMILAR WAVEGUIDE SLOT COUPLERS
ABSTRACT: The characteristics of dissimilar orthogonal rectangular waveguides coupled through an inclined
coupling slot have been studied. A moment method solution is used with entire-domain basis and testing functions.
Numerical results for resonant length and scattering parameters over a range of non-identical mainline and
branchline waveguide dimensions are presented. The moment method code developed has been validated
experimentally by designing novel test fixtures.


INTRODUCTION 

Planar slotted arrays are widely used in radar and microwave communication systems because of their
rugged, compact structure. Such planar arrays require coupling slots as feeding elements in addition to radiating
elements. A centered-inclined slot located in the common broad wall of the radiating and feeding (hereafter called
branchline and mainline) waveguides is a widely used coupling element because of its better off-resonance
behavior. A fully assembled slotted array with feed network is difficult to test because many of the desired points
are inaccessible. Therefore, it is often desirable to test separate sub-array sections using the dimensions of the actual
array, which requires isolated coupling slot characterization. Because of electrical and mechanical constraints,
most slotted arrays have non-identical, non-standard mainline and branchline dimensions. Low side
lobe array design requires that the impedance of an isolated coupling slot located in the common broad wall of two
dissimilar waveguides be known to a high degree of accuracy.


Waveguide slot couplers have been discussed in the literature by many investigators [1-5]. A rigorous
analysis of the centered-inclined slot coupler has been made by Rengarajan [6], who developed integral equations for
the slot E-field, taking into account finite wall thickness, with a solution obtained using the method of moments. He
presented numerical results for resonant length and scattering parameters over a range of slot inclinations,
frequencies and waveguide dimensions. Most of his results deal with identical mainline and branchline waveguide
dimensions. The present paper deals with dissimilar waveguide slot couplers. The formulation used involves the
moment method solution of a pair of coupled integral equations containing dyadic Green functions of the
homogeneous, isotropic, lossless filled mainline and branchline waveguides. The method of analysis is similar to [6],
with certain modifications in the slot resonant length and scattering parameter calculations. Due to the complexity
of the boundary value problem encountered in the coupling slot modeling, all analytical and numerical analyses
involve a set of reasonable approximations. Although the moment method solution offers a high degree of accuracy
and provides an inexpensive alternative to measurements, it is essential to cross check computed data with some
experiment to have an estimate of the error and to make appropriate adjustments in the final array design. Here
an attempt has been made to validate theoretical data experimentally by designing novel test jigs.
THEORY


The slot is assumed to be excited by the TE_10 mode from a matched generator and terminated in a
matched load. The coordinate system of the structure is shown in Fig. 1a and 1b. The structure is divided into
the regions A, B and C, where A is the mainline waveguide region, B is the slot (or cavity) region and C is the
branchline region. In regions A and C, the slot is completely shorted and magnetic current sheets M_a1 and M_a2
are placed in its location at y = b^- and y = b + t^+. In region B, both the slot apertures are shorted by magnetic
current sheets -M_a1 and -M_a2. The problem is solved by making use of a pair of integral equations for the slot
electric field, which can be arrived at by equating the magnetic fields on the two sides of each slot interface. The
slot electric field is assumed to vary along the length, with no variation across the width.

A  pair  of  coupled  integral  equations  obtained  after  enforcing  the  continuity  of  longitudinal  magnetic 
fields  across  each  slot  aperture  is 

h::’-h:,  =  o 

where  is  the  incident  tangential  magnetic  field,  given  by 

[  7COs(TD:/a,„)cos9 - ^^sm(jix/a„)sin0  ]  ...(3) 

where a_m is the mainline waveguide "a" dimension.

The  //J'  is  the  scattered  magnetic  field  inside  the  waveguide  and  is  expressed  in  terms  of  aperture 
magnetic  current  M^,  through  respective  Green's  function 

//7=n[cos0  sinG]  [G"^ain]  [  cos0  sinG  f  ds'  ...(4) 

s' 

The  integration  is  performed  over  the  interior  slot  aperture  . 

[G^main] is the internal mainline waveguide Green's function, which is similar to that of [7] with a, b and k
replaced by a_m, b_m and k_1, where k_1 = k√ε_m and ε_m is the permittivity of the dielectric material in the mainline
waveguide.

//„[  and  H^2  are  the  lower  and  the  upper  aperture  fields  in  the  cavity  due  to  the  magnetic  currents 
and  -  Mq2  respectively  and  are  given  by 

Ki  =  -  n  a/,,  ds'  -  n  g;,,  K2  ds’  ...(5) 

5,  52 

Kl  =  -  W  G/oca  ^al  ds’  -  If  G;„^  M^2  ds’  -(6) 

^2 


...(1) 

...(2) 
where and are cavity region Green's functions [8]

is  the  tangential  magnetic  field  in  the  region  C  and  is  computed  from  magnetic  current  using 
branchline  internal  waveguide  Green's  function  i.e., 

J|  [  sine  cose  ]  [  G^J’anch]  [  gine  cosO  f  ds’ 

where [G^branch] is expressed in terms of the branchline waveguide dimensions and dielectric constant ε_b.

The integral equations (1) and (2) are solved by the method of moments using Galerkin's method of testing
[9]. It has been found that around the first resonance the field within the slot is essentially sinusoidal, and hence
trigonometric functions can be chosen as basis functions. Let

$$M_{a1} = \sum_{q=1}^{N} A_q \sin[\, q\pi(\xi' + L)/(2L) \,]$$

$$M_{a2} = \sum_{q=1}^{N} B_q \sin[\, q\pi(\xi' + L)/(2L) \,]$$

where A_q and B_q are unknown coefficients.

As Galerkin's method is used, the weighting functions should be of the same type as the basis functions, namely

$$w_p = \sin[\, p\pi(\xi' + L)/(2L) \,]$$

The basis functions should satisfy the same boundary conditions as the magnetic current sources in
order for the magnetic current expansion series to converge. The moment method converts the integral equations into a
matrix equation, which is then solved for the unknown coefficients A_q and B_q. The field distribution on the surface
of the slot can be obtained once the unknown coefficients are solved.
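Two properties of entire-domain sinusoidal basis functions of the form sin[qπ(ξ + L)/(2L)] on a slot of length 2L are easy to verify numerically: they vanish at both slot ends, and they are mutually orthogonal, so that with Galerkin testing the Gram matrix of the basis is L times the identity. The sketch below is a minimal illustration; the slot half-length, the coordinate name `xi` and the number of basis functions are assumptions of this example, and no Green's function integrals are evaluated.

```python
import numpy as np

def basis(q, xi, half_length):
    """Entire-domain sinusoidal basis function on -L <= xi <= L."""
    return np.sin(q * np.pi * (xi + half_length) / (2.0 * half_length))

half_length = 7.5e-3                 # hypothetical slot half-length L (m)
n_basis = 4                          # assumed number of basis functions
xi = np.linspace(-half_length, half_length, 4001)
phi = np.array([basis(q, xi, half_length) for q in range(1, n_basis + 1)])

# Galerkin testing: weighting functions equal the basis functions, so the
# Gram matrix is the integral of phi_p * phi_q over the slot (trapezoid rule).
dxi = xi[1] - xi[0]
integrand = phi[:, None, :] * phi[None, :, :]
gram = np.sum(0.5 * (integrand[..., 1:] + integrand[..., :-1]), axis=-1) * dxi
```

The zero end values match the requirement that the slot field vanishes at the slot ends, which is the boundary condition the basis is meant to share with the magnetic current.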

NUMERICAL  RESULTS  AND  DISCUSSION 

The centered-inclined slot is modeled as a series element. The resonance condition is defined as the point where
the slot magnetic current is purely real, thus avoiding the previously ambiguous resonance condition [5]. The scattering
parameters are modified in order to satisfy the power balance condition, and proper normalization has been done
in terms of the mainline and branchline waveguide dimensions and corresponding impedances to make the scattering
matrix symmetrical. Based on the moment method analysis, a software package has been developed to compute the
slot length and other scattering parameters at resonance. The secant method of root finding has been used for faster
convergence.
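A secant search for the resonant length can be sketched as follows. The function `imag_current` is a toy stand-in with a known zero at 15 mm, used only to exercise the root finder; in the actual analysis the quantity driven to zero would be the imaginary part of the slot magnetic current obtained from the moment method solution. All names and numbers here are assumptions of this example.

```python
def secant(func, x0, x1, tol=1e-10, max_iter=50):
    """Secant root finder: repeatedly replaces the older of two points
    by the zero of the straight line through (x0, f0) and (x1, f1)."""
    f0, f1 = func(x0), func(x1)
    for _ in range(max_iter):
        if f1 == f0:          # flat secant line, cannot continue
            break
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0, x1, f1 = x1, f1, x2, func(x2)
        if abs(x1 - x0) < tol:
            break
    return x1

def imag_current(slot_length_mm):
    """Toy stand-in for Im(slot magnetic current) versus slot length 2L;
    it is zero at 15 mm by construction (hypothetical values)."""
    d = slot_length_mm - 15.0
    return 0.8 * d + 0.05 * d ** 3

resonant_length_mm = secant(imag_current, 13.0, 16.0)
```

The secant method needs only function values, not derivatives, which suits a case where each evaluation requires a full moment method solution.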

Fig. 2 illustrates the variation of resonant length (2L) as a function of slot inclination (θ) for non-standard,
non-identical mainline and branchline waveguide "a" dimensions. The resonant length increases with
inclination, and the variation is larger in the case of reduced height waveguide couplers. Fig. 3 shows the variation of
resonant length with inclination where the mainline is a standard waveguide while the branchline is non-standard.
The variation is the same as in the previous case, but the slot resonant length is shorter. Fig. 4
and 5 show the back scattered wave amplitude, |S11|, at resonance as a function of slot tilt angle for two different
cases. Waveguide dimensions affect |S11| by a small amount. For both cases |S11| = 0.5 at θ = 45 deg., where
half the power is coupled into the branchline waveguide. Computed results show that S11(θ) = S11(90° - θ). In all
these computations ε_m and ε_b are taken as 1.0. The effect of different ε_m and ε_b has also been studied. For a
standard half height waveguide slot coupler with ε_m = 2.0 and ε_b = 1.0, computed results show a 4.5% decrease
in resonant length for θ lying between 5 deg. and 15 degrees. For higher values of slot inclination and ε_m, the
series behavior of the slot is found to be poor. When ε_m = 1.0 and ε_b = 2.0, the reduction in resonant length is
significant: about 13% for θ lying between 5 and 35 degrees. A similar behavior is observed in the case of
non-standard waveguide slot couplers also.


EXPERIMENTAL  VALIDATION 

To validate the moment method code developed, a novel test fixture containing two blocks with coupling
insert plates has been devised. The waveguides are obtained by machining two identical solid
aluminum blocks. In the first block, a channel of the required mainline waveguide dimensions (22.532 mm x 5.08 mm)
is machined out. In the second block, a channel of the required branchline waveguide dimensions (19.953 mm x 5.08
mm) is machined out in such a manner that, when the two blocks are connected together, the channels are
orthogonal to each other. In order to have a common wall between the two blocks, a coupling insert plate, which
has the same geometry as the blocks, is designed. To obtain more than one data point, the test fixture accommodates
several coupling slot geometries by virtue of different insert plates. Indexing pins of appropriate dimensions are
provided to match the mainline, branchline and coupling insert plates. Apart from this, a number of screws at a
number of places are used to avoid any leakage. A photograph of the test jig along with a coupling insert plate is
shown in Fig. 6. The test fixture can be viewed as a four port transmission line network. Transitions from 22.532
x 5.08 mm to 22.86 x 5.08 mm and from 19.953 x 5.08 mm to 22.86 x 5.08 mm were made, and standard
coax-to-waveguide adapters provided by HP were used to connect the test jig to an HP 8510 vector network analyzer. The
network analyzer is calibrated for two-port measurements using a standard 8036 half height calibration kit. Coupling
insert plates with slot tilts of 15, 25, 27.5, 32.5 and 35 degrees were fabricated and extensive measurements were
carried out for all the scattering parameters. The results for |S11| are shown in Table-I along with computed results. The
agreement between computed and measured results is seen to be excellent.

CONCLUSION 

Based on moment method analysis, a generalized software package has been developed. Great care has been taken in computing the slot resonant length and scattering parameters for dissimilar waveguide slot couplers. The validity of the moment method code developed has been checked experimentally.

REFERENCES 

[1] W.H. Watson, The Physical Principles of Waveguide Transmission and Antenna Systems, London: Clarendon Press, 1949.

[2] T. Vu Khac and C.T. Carson, "Coupling by slots in rectangular waveguides with arbitrary wall thickness," Electron. Lett., vol. 8, no. 18, pp. 296-297, Sept. 1972.

[3] P.K. Park et al., "Shunt/series coupling slot in rectangular waveguide," in IEEE Int. Antennas and Propagat. Symp. Dig., 1984, pp. 62-65.

[4] D.C. Senior, "Higher order mode coupling effects in shunt series coupling junction of planar slot array antenna," Ph.D. dissertation, UCLA, 1986.

[5] S.R. Rengarajan, "Characteristics of longitudinal transverse coupling slots in crossed rectangular waveguides," IEEE-MTT, vol. 37, pp. 1171-1177, 1989.

[6] S.R. Rengarajan, "Analysis of centered-inclined waveguide slot coupler," IEEE-MTT, vol. 37, no. 5, pp. 884-889, 1989.

[7] A.F. Stevenson, "Theory of slots in rectangular waveguides," J. Appl. Phys., vol. 19, pp. 24-38, 1948.


[8] S.R. Rengarajan, "Compound radiating slots in the broad wall of rectangular waveguide," IEEE-AP, vol. 37, pp. 1116-1123, 1989.

[9] R.F. Harrington, Field Computation by Moment Methods, New York, McGraw-Hill, 1969.

ACKNOWLEDGMENT 

The authors wish to thank Mr. K.U. Limaye, Scientist 'F', Electronics and Radar Development Establishment, Bangalore, for his constant encouragement and valuable suggestions throughout the work, and Mr. N.P. Ramasubba Rao, Director, Electronics and Radar Development Establishment, Bangalore, for his kind permission to publish this work.


Mainline  waveguide 


Fig.  la  Waveguide  &  Cavity  Regions 




Fig.  lb  Coordinate  system  of  the  slot 
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Q>2  z;  0>2 


Fig. 4 Variation of Mag. S11 with slot tilt


Fig. 5 Variation of Mag. S11 with slot tilt
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Fig.  6  PHOTOGRAPH  OF  TEST  JIG  WITH  COUPLING  INSERT  PLATE 
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Waveguide  Dimensions  (mni) 

Mainline : 22.532 x 5.08

Branchline : 19.953 x 5.08

Slot Dimensions (mm)

Width : 1.58

Thickness : 1.00

Frequency (GHz) : 9.60


Slot Tilt (deg.)   Resonant Length (mm)   |S11| in dB (Theoretical)   |S11| in dB (Experimental)
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Abstract 

An alternative formulation of the transverse resonance technique (TRT) is presented. The difference between the usual TRT and the formulation presented here is the transverse equivalent network considered. With the proposed TRT formulation, the identification of mode solutions requires less arduous work. Numerical results are presented for microstrip, coupled microstrips and conductor-backed coplanar waveguide.

I.  Introduction 

The transverse resonance technique (TRT) can be applied to a large class of microwave and millimeter-wave passive structure problems. It has been used not only to obtain dispersion characteristics, but also to characterize a large variety of discontinuity problems in planar and quasi-planar structures [1], [2]. In this paper an alternative formulation of the TRT is presented. With the proposed formulation, open side structures can be analyzed exactly, boxed modes can be avoided and the identification of mode solutions requires less arduous work. In this manner, not only dominant modes, but also higher order modes can be studied and distinguished from boxed modes. Numerical results obtained by the proposed TRT formulation are presented for microstrip, coupled microstrips and conductor-backed coplanar waveguide. They are in good agreement with results obtained by other methods.

II.  Theory 

In the conventional formulation of the TRT, a suitable transverse equivalent network is established to compute the cutoff frequencies and possibly some additional characteristics of the structures [2]. The difference between the conventional and the proposed formulation is the adopted equivalent network, and consequently the admittance matrix expressions. The equivalent network and admittance matrix for the usual TRT are presented in Fig. 1. The adopted equivalent network and admittance matrix expressions for the proposed TRT are presented in Fig. 2. Mode coupling, which occurs at each step discontinuity, is represented by a generalized voltage source. The transmission line sections represent the different structure sections (two homogeneous waveguides and one inhomogeneous), and the admittances represent the boundary conditions. The admittance matrix elements are detailed in [3].
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III.  Numerical  Results 

In order to exemplify the proposed TRT formulation, numerical results are presented here for microstrip (MS), coupled microstrips (CMS) and conductor-backed coplanar waveguide (CBCW) (Fig. 3). The results were obtained by a computer program on a personal computer. In Fig. 4 results for the effective dielectric constant of a microstrip are presented. When compared to the results of [4], good agreement is observed not only for the dominant mode ($EH_0$), but also for the higher order modes.


Figure 3: a) MS b) CMS c) CBCW


In Fig. 5 two different coupled microstrips are considered. In both cases, the obtained results for the first even and odd modes are in accordance with the results of [5].

A conductor-backed coplanar waveguide is considered in Fig. 6. Numerical results are presented for the fundamental and higher order modes, and are in good agreement with the results of [6].

IV.  Conclusions 

An alternative TRT formulation is presented which constitutes a versatile method for the calculation of dispersion characteristics of practical MIC, MHMIC and MMIC structures. The effort in the identification of mode solutions, especially higher order modes, is considerably reduced. The results presented are in accordance with results obtained by other authors.
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F'igure  1:  Usual  'TKT 


One advantage of the proposed formulation is that open side structures can be analyzed exactly. When an open microstrip is considered by the usual formulation, two shorting end plates are placed at such a distance from the center strip that their effects on the microstrip electromagnetic fields are negligible. The problem with this approach is the presence of "boxed modes", which makes the identification of specific microstrip modes an arduous work. When the present formulation is used, the two shorting end plates can be removed and infinite transmission lines are used instead. So, the work in mode identification is considerably reduced. In addition, equivalent networks and admittance matrices can be obtained for coupled microstrips, conductor-backed coplanar waveguide and other structures.


Figure  2:  Proposed  TRT 
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Effective  dielectric  constant 


H = 0.80 mm, h = 0.20 mm, W = 0.14 mm, S = 0.28 mm, L =


Figure 6: CBCW - $\varepsilon_{eff}$ vs. Freq. (GHz)
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ABSTRACT:-  The  dielectric  covered  radiating  slot  in  the  broad  wall  of  the  rectangular  waveguide 
has been analyzed. An alternative method of using both electric and magnetic vector potentials for the derivation of the spectral domain Green's function has been presented. The efficiency of the present method has been pointed out. To validate the analysis, the derived dielectric Green's function has been applied to shunt and series slots. The numerical results have been compared with the published results. The problems encountered in the numerical integration have been discussed.


1.  INTRODUCTION 


Waveguide slots are well suited for array applications because of properties such as control over amplitude, phase and polarization. A dielectric substrate layer over the exterior of the array is used extensively for practical considerations. The presence of the dielectric cover (or layer) will detune the individual slots, which in turn leads to pattern deterioration. Hence, the electrical properties of the radiating elements with a dielectric sheet need to be studied. The dielectric covered slots were studied by Bailey in 1969 [1]. He used a single basis function and derived expressions for slot admittance by using conservation of power at the slot aperture. The study of waveguide slots radiating into free space is well advanced. Rengarajan studied the centered-inclined slot radiating into free space [2]. Katehi characterized the dielectric covered longitudinal offset radiating slot using the moment method [3].

This paper presents a detailed study of the dielectric covered longitudinal offset and centered-inclined radiating slots on the broad wall of a rectangular waveguide. The moment method has been used for obtaining the resonant characteristics of the radiating element. A simplified method has been followed in the derivation of the external Green's function. The region external to the waveguide is divided into two layers and in each layer, the electromagnetic fields are expanded in terms of a set of vector wave functions, namely $A_y(x,y,z)$ and $F_y(x,y,z)$, in contrast to the earlier method [3]. The tangential components of the fields are matched at the junction of the two layers, and the tangential magnetic field due to a Dirac delta source at the slot aperture, which in other words is the external Green's function, has been found. The resulting expression is less complicated and hence computationally less expensive when compared to the approach of using only the electric vector potential for the derivation [3].
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2.  THEORY 


The geometry of the slot with dielectric cover is shown in Fig. 1. The slot is assumed to be narrow so that the only significant component of the electric field lies in the x-direction. The slot is covered by a dielectric layer of thickness 'h'.

Using the equivalence principle, the x-directed electric field is replaced by a z-directed magnetic current as

$\mathbf{M} = \hat{n} \times \mathbf{E}$ ....(1)

In the absence of any media discontinuity, the z-directed magnetic current gives rise to only the z-component of the magnetic vector potential, $F_z$. But when the magnetic current is placed at the media discontinuity, one more component of F has to be considered in addition to $F_z$ to satisfy the boundary condition. One approach is to use the $F_z$ and $F_y$ components. A second approach is to expand the electromagnetic fields in terms of the vector potentials normal to the media discontinuity. Traditionally, the problem of a dielectric covered slot in an infinite ground plane has been dealt with using the first approach [3]. The method presented in this paper uses both the electric and magnetic vector potentials for deriving the magnetic field Green's function.

For  the  sake  of  comparison,  the  spatial  domain  expression  of  the  Green’s  function  is  given 
below.  The  tangential  magnetic  field  at  the  slot  aperture  is  given  by[3]. 


where 


JJ 


//•').  A/, 


klGi 

with  Gf,  amiGy,  being  the  dielectric  Green’s  functions  given  by, 

^  _  -j(OE,£,  [  -I  -  —  I  A  i/cosh[H(-/i  +y)\+s,u^  sinh[ [/(-/> +  y)] 
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sinh(ij>^) 
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«cosh(M/j)+w„  sinh(w/i)  eu^  cosh{«/j)+Msinh(M/j) 


-dA 


where  and  u  =  yjA^  -  k]  .  ...(3.b) 

The above method of finding the Green's functions of $F_z$ and $F_y$ leads to the presence of second order derivatives w.r.t. y and z. The computation of the partial derivatives of the dielectric Green's functions is a difficult task. An alternate approach has been followed here to reduce the complexity of the tangential magnetic field at the slot aperture. In the approach outlined below, the problem is formulated in such a way that the Green's function for the magnetic field is found directly, thus avoiding the partial derivatives.

The  spatial  domain  and  the  spectral  domain  relation  is 

|j  Fe-^'^’e^'’’dk^dk^  ....(4.a) 
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k,,yX)  =  —  n 


The electromagnetic fields in the dielectric layer and in the free space are expressed in terms of a set of vector potentials $\mathbf{A}^i(x,y,z)$ and $\mathbf{F}^i(x,y,z)$ where

$\mathbf{A}^i(x,y,z) = A_y^i(x,y,z)\,\hat{y}$ ....(5)

$\mathbf{F}^i(x,y,z) = F_y^i(x,y,z)\,\hat{y}$ ....(6)

which satisfy the wave equation:

$(\nabla^2 + k_i^2)\,f(x,y,z) = 0$ ....(7)

where $f(x,y,z)$ can be either $A_y^i(x,y,z)$ or $F_y^i(x,y,z)$.

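The electromagnetic fields are related to the vector potentials as

$\mathbf{H} = \nabla\times\mathbf{A} + \frac{1}{j\omega\mu}\left[\nabla(\nabla\cdot\mathbf{F}) + k_i^2\,\mathbf{F}\right]$ ....(8)

$\mathbf{E} = -\nabla\times\mathbf{F} + \frac{1}{j\omega\varepsilon}\left[\nabla(\nabla\cdot\mathbf{A}) + k_i^2\,\mathbf{A}\right]$ ....(9)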

The  corresponding  Fourier  transformed  equations  are 


k  dAy 

K=-jKF — 

COflS  ^ 


~  =  k  SA' 

fI  =  Af; — 

COfJiE  4' 


ir  A-  4  k,  dFl 

n  con  X 


-jk^  Ay  k^  ^Fy 

n  con  dy 


Solving  Eq.(lO).  at  y  =  0,  for  and  Fy  gives 


=  c^Al  -kli^  . 

A’  =  — ^  =  ^  cons  ...(11a) 

{k^,+kj) 

.(ll.b) 

ikl+k]) 

The  solution  of  wave  equation  Eq(7),  subjected  to  appropriate  boundary  conditions  gives  rise  to 


J_  [u^  +js,u^  tan(;m^)]  _  J_ 
u/\s,u^+jujtan(huj)]  uy 
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p  V  [Wd+y»c  tan(/iWj)] 

where  u]  -  k]  -  k]  -  k]  and  mJ  -  kl  -  k]  -  k] . 
Substituting  the  Eq.(ll)  to  (12)  in  Eq.(lO)  leads  to 


{kl+k^u,  [k]+k])o}fi 


...(12.b) 


.,..(13) 


Eq.(13) gives the magnetic field in the spectral domain, and its counterpart in the spatial domain is obtained by inverse Fourier transformation. The advantage of this formulation is that the complex spatial derivatives are avoided and the resulting expression is simplified. It expresses the magnetic field directly in terms of the slot aperture electric field. The spectral functions fg and ff have multiple TE and TM poles. Their respective contributions can be computed from the two terms on the right hand side of Eq.(13). Using only the vector potential 'F' in the derivation of the dielectric Green's function leads to the presence of both TE and TM singularities in the second term of Eq.(13) in addition to the first term. This complicates the evaluation of the integral.


RESULTS: 


To validate the procedure, the above developed theory is applied to solve the problem of a slot radiating into a finite dielectric slab. The expression for the magnetic field derived in Eq.(14) has been used as the external scattered magnetic field, and for the internal scattered fields, Stevenson's Green's functions are used [4]. The unknown slot field is expanded into entire domain sinusoidal functions. The coupled integral equations resulting from matching the tangential magnetic fields at the top and bottom slot apertures are converted into a matrix equation using the moment method. The matrix equation is solved by direct inversion.


A generalized moment method code has been developed which uses the alternative expressions presented above for characterizing the longitudinal offset (shunt) and the centered-inclined (series) radiating slots. To validate the approach, the method presented here has been applied to compute the resonant length and the resonant conductance of an isolated shunt slot radiating into a dielectric slab of $\varepsilon_r = 2.62$ and a thickness of 0.062" (Fig. 2 and Fig. 3). The results are presented for various slot offsets with and without wall thickness and are found to be in good agreement with published results [3]. For a centered-inclined radiating slot, the resonant length and the resonant conductance have been computed for $\varepsilon_r = 1.0$ and $\varepsilon_r = 2.62$. The results are presented in Fig. 4 and Fig. 5 over a range of slot tilts. For the case of $\varepsilon_r = 1.0$, as can be observed, the data obtained by using the alternate Green's function match well with the earlier published results [2]. Because of the lack of published literature, the dielectric covered centered-inclined radiating slot data have been presented without any comparison. Since the above developed theory has already been validated for the longitudinal offset slot, the theory and the results generated are expected to be reliable for the centered-inclined case also.
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CONCLUSION: 


A simplified expression has been derived for the dielectric Green's function of a slot radiating into a finite dielectric slab. The above developed theory has been applied to characterize the longitudinal offset and the centered-inclined radiating slots. It has been found that the Green's function presented in this paper reduces the computational complexity without affecting the accuracy of the moment method.


REFERENCES: 

[1] M.C. Bailey, "The impedance properties of dielectric-covered narrow radiating slots in the broad face of a rectangular waveguide," IEEE Trans. Antennas Propagat., vol. AP-18, pp. 596-603, Sept. 1970.

[2] S.R. Rengarajan, "Scattering characteristics of centered-inclined slot in the broad wall of a rectangular waveguide," IEE Proc. H, 1990, 137, pp. 343-348.

[3] P.B. Katehi, "Dielectric-covered waveguide longitudinal slots with finite wall thickness," IEEE Trans. Antennas Propagat., pp. 1039-1045, July 1990.

[4] A.F. Stevenson, "Theory of slots in rectangular waveguides," J. Appl. Phys., 19, pp. 24-38.


ACKNOWLEDGMENTS: 

The authors wish to thank Mr. K.U. Limaye, Sc 'F', LRDE, for the facilities provided, without which this work would not have been completed, and Mr. N.P. Ramasubba Rao, Director, LRDE, for his kind permission to publish this work.


(a) Longitudinal offset radiating slot (b) Centered-inclined radiating slot

Fig. 1 Dielectric covered Slot in the Broad wall of Rectangular Waveguide
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Abstract 

Knowledge of the E-field distribution of a low gain antenna on a conducting body helps to arrange antennas and solve Electromagnetic Compatibility (EMC) problems. In this article, a resonant linear antenna on a conducting body of revolution (BOR) is discussed. The moment method and the Fourier analysis method are used to obtain the BOR surface current excited by the antenna. From the antenna current and the surface current, the E-field distribution on the BOR surface is obtained. The distribution results are shown in figures, and the variation of E-field strength over the whole body surface can be seen clearly.

1 Introduction

With the development of space technology, spacecraft are becoming more and more complex. It is essential to arrange all the spacecraft antennas reasonably and solve the Electromagnetic Compatibility (EMC) problems. High gain antennas have narrow radiation patterns, and their influence on other antennas is small. On the other hand, low gain antennas have wide radiation patterns, and their influence is strong. The distribution of surface current and E-field excited by a low gain antenna on the spacecraft surface is helpful for antenna arrangement and EMC. In articles [1][2][3], the computation and figures of the surface current distribution excited by a resonant linear antenna on a conducting body of revolution (BOR) have been discussed. In this article, we focus our discussion on the E-field distribution under the same conditions. For simplicity, the current distribution on the resonant linear antenna is assumed to be a sinusoidal standing wave distribution, the body's influence on the antenna current distribution is ignored, and the conducting body of revolution is limited to geometries in the resonant region.

The time factor is $e^{j\omega t}$.

2  Analysis 

The conducting body of revolution to be discussed here and its coordinate system are shown in Fig 1. A resonant linear antenna of arbitrary shape is attached to the body. By solving the magnetic field integral equation (MFIE), the surface current can be obtained [4]. From the integral of the antenna current over the antenna itself and that of the surface current over the BOR surface, the E-field distribution is obtained.


2.1 The Magnetic Field Integral Equation and Linearization [4]

From Maxwell's equations, the following MFIE for the surface current density $\sigma$ on the surface of a conducting body in an incident field with magnetic field strength $H^{inc}$ may be derived:


af  p)  =  la''' (p) - 0)7  y.a(q)-xV 

2/T 


,,-jkp 


-ds' 


P 


(2.1.1) 


* This work was supported by National Nature Science Foundation of China.


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Fig  1  Conducting  Body  of  Revolution  and  Its  Coordination 
Here, $\hat{n}$ is an outward unit normal vector to the surface, $\rho$ is the distance between the observation point p and the source point q on area element ds', k is the wave number in free space, and $\sigma^{inc}(p) = \hat{n} \times H^{inc}(p)$. The integral is carried out over the body surface. The symbol $\nabla'$ denotes a gradient with respect to the primed coordinates.

In order to express the surface current density in terms of generatrix line length v and azimuth angle $\varphi$, a new coordinate system (u, v, $\varphi$) is introduced.

The  j p  can  be  expressed  as  Fourier  series; 

^{v^(p)=  ^5-  (v)v  +  cr  {v)^)e (2.1.2) 

711  VJ7)  wtn 

f  =  y  (<T'’"(v)v-  +  a'"‘(v)p)£.''”’'  (2  1,3) 

x  w-  y 

12^  =  y  N  [upi (2,1.4) 

P 

Using (2.1.2), (2.1.3) and (2.1.4), the integral equation (2.1.1) can be transformed into the following integral equations for each Fourier harmonic (m) of the v and $\varphi$ components.

a  =^2c7’"'  -\{P  <7  +  P  (7  )R'Liv'  (2.1.5) 

via  vm  mil  vm  m\l  (piii 

cr  =2(7^'  -\{P  a  +P  (7  )R'dv'  (2.1.6) 

{fiu  (fyn  ni2]  vm  mil  (/wii 

PmH,  Pmi:,  PnCh  Pm:2  3^6  thc  functioHS  of  0,  O’,  Z,  Z  \  R,  R S„,. 
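
The azimuthal Fourier decomposition used above can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes the expansion convention $\sigma(\varphi) = \sum_m \sigma_m e^{jm\varphi}$ (the sign of the exponent is not legible in the source) and uniformly spaced samples in $\varphi$. The function names `azimuthal_harmonics` and `synthesize` are hypothetical.

```python
import numpy as np

def azimuthal_harmonics(samples, m_max):
    """Return {m: sigma_m} for m = -m_max..m_max from uniform samples in phi.

    Assumes sigma(phi) = sum_m sigma_m * exp(1j*m*phi) and that the number
    of samples exceeds 2*m_max (otherwise harmonics alias onto each other).
    """
    samples = np.asarray(samples, dtype=complex)
    n = samples.size
    if n <= 2 * m_max:
        raise ValueError("need more than 2*m_max samples")
    # np.fft.fft uses exp(-2j*pi*m*n/N), so dividing by N gives sigma_m.
    spectrum = np.fft.fft(samples) / n
    return {m: spectrum[m % n] for m in range(-m_max, m_max + 1)}

def synthesize(coeffs, phi):
    """Rebuild sigma(phi) from the harmonics returned above."""
    phi = np.asarray(phi, dtype=float)
    return sum(c * np.exp(1j * m * phi) for m, c in coeffs.items())

# Example: a current component with a constant, an m = +/-1 and an m = +/-2 part.
phi = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
sigma = 2.0 + np.cos(phi) + 0.5j * np.sin(2.0 * phi)
coeffs = azimuthal_harmonics(sigma, m_max=3)
rebuilt = synthesize(coeffs, phi)
```

Each harmonic m can then be treated independently, which is what reduces the surface integral equation to a set of one-dimensional equations along the generatrix.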

Using pulse functions for expansion and point-matching, the integral equations (2.1.5) and (2.1.6) become linear equations.

P 

The primary surface current associated with the incident $H^{inc}$ from the linear antenna is still unknown, and will be discussed in the next section. Solving (2.1.7), the current distribution $\sigma$ on the BOR surface can be obtained.
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2.2  The  Formulas  of  Primary  Surface  Current  Associated  with  the  Incident  from  Arbitrary 
Linear  Antenna[4j 

Fig 2 An arbitrary Hertz dipole (shown with its three components and a plane containing the Hertz dipole and parallel to the z axis)

A source, whose volume current density is J, produces the incident $H^{inc}$, which produces the primary surface current $\sigma^{inc}$. The $\sigma^{inc}$ can be expressed as the following integral:

-dv' 


P 


(2.2.1) 


Similar to (2.1.1), (2.2.1) can be expressed as the following:

J  +P’'  J  )R"di/'dv"  (2.2.2) 

vm  2*^  y'm  /»!2  tp'iu 


=-~\{P"  J  +P''  J  )R"du"dP'  (2.2.3) 

(fill  2"^  "i2l  v''in  mil  (p"m 

P  ,  P  ,  P  ,  are  the  functions  of  0,  P,  Z,  Z”.  R,  R”,  S”,,, 

;»ll  mil  mil  mil 

The line current density on the linear antenna should be expressed as volume current density, so that the $\sigma^{inc}_{vm}$ and $\sigma^{inc}_{\varphi m}$ can be obtained through (2.2.2) and (2.2.3) respectively.

As shown in Fig 2, a Hertz dipole $I\,dl$ is expressed as following:

(2.2.4) 


id/l^i  d^'P'-^i  difif^i  R"d(p"(p" 

4"  v>" 

-  !  dv"  v"+i  R"  d(p"  (p" 

!  c/v"=  /cosher  +cos"a  nil  (2.2.5) 

v"  "V  e'"  rf 

I  R" d<p"=  cosa  idl  (2.2.6) 

P  <p" 


cos  a 


(2.2.7) 


where $\alpha_{\xi''}$, $\alpha_{\eta''}$, $\alpha_{\varphi''}$ are the angles between the Hertz dipole and $\xi''$, $\eta''$, $\varphi''$ respectively.


J  R''du"dv"^^ 


- A'(v"-V  )t/v" 

2k  q 


(2.2.8) 


J  R\hi"dv"=~ - S{P^-v)dv"  (2.2.9) 
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From  (2.2.4)  to  (2.2.9),  the  integral  equations  (2.2.2)  and  (2.2.3)  are  transferred  into  the  following 
equations  for  and  <7^’^  respectively. 


[P  /cos"  tz  +  cos"  or 

+  P" 

cos  a  )e  ' id! 

(2.2,10) 

4/r’' 

mii'y  r 

m\ 

2  (p" 

fj/’'-  = 

-T| 

( P  /cos"  or  +  cos*  or 

+  P" 

COSO'  )c~^”'‘id! 

(2.2.11) 

(fttl 

4jt  j 

ml\^  r  rf 

m'. 

n  ip" 

The $\alpha_{\xi''}$, $\alpha_{\eta''}$, $\alpha_{\varphi''}$ are functions of the antenna length l.

2.3  E-Field  Computation  Formulas 

In this section, the E-field of an arbitrarily placed Hertz dipole is analyzed. The effect of the antenna current and the surface current on the body can be regarded as that of many Hertz dipoles with different sites, orientations, phases and strengths. The whole E-field distribution is then obtained from the sum of these dipoles' near fields.

Suppose a z-direction Hertz dipole $I\Delta l$ is at source point (x', y', z'), as shown in Fig. 3, and the field point is (x, y, z). We have the following formulas for obtaining the E-field:

r.  1  VV-.4  1  ,,2  1  vv/  /oam  A  Hertz  dipole 

‘  z-directa 

.4  =  f  (2,3  2)  ' 

I  Att  I 

I _ ^ 

R  =  ^|{x-  y  y  +  (y  -  y )"  +  (r  -  r' )-  (2.3.3)  ^ 

!■■  =- -  (2.3.4)  'x 


Here  A  has  only  z  component.  Then  we  have: 


1 

3fA 

A 

3- A 

A'  = 

/  /.  -  .f  -  I 

2  ^ 

z  C:  1 

f  A  .H  -  A 

J0JS/.1  - 

33 

33 ' 

Z24 

_  ///A/  cli 

.V  -  y  )/■' 

(2.3.6) 

ck 

4;r  3 

An 

2 

cA 

fiiM  a< 

V  -  y  )/'’^ 

(2.3,7) 

d> 

An:  3R  3 

4;r 

31 

_  fihM  ai 

_  P3/ 

z-z')P- 

(2.3,8) 

3 

An  a^  3 

An 

2 

F  = 

(/M+1) 

: - ^ ^ - e  ^ 

<kH 

(2.3.9) 

2  R' 

HP 

v’)  " 

ai  3 

33  An 

Fig  3  A  Hertz  dipole  and  the 
coordination 


-(.V -A'')(r -.-')/■' 


(2,3. lO) 
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Ak  cR  (t 


(2.3.11) 


A 


l^iM 


cl- 


dzdz  4;r 


at  &  I  A 


4;r  L  '12 


(2.3.12) 


/•■  - 


/?•' 


(2.3.13) 


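Then, the E-field at point (x, y, z) produced by the z-direction Hertz dipole can be given as following:

$$\mathbf{E}_{z'} = E_{z'x}\,\hat{x} + E_{z'y}\,\hat{y} + E_{z'z}\,\hat{z} \qquad (2.3.14)$$

$$E_{z'x} = \frac{30\,I\Delta l}{jk}\,F_3\,(x-x')(z-z') \qquad (2.3.15)$$

$$E_{z'y} = \frac{30\,I\Delta l}{jk}\,F_3\,(y-y')(z-z') \qquad (2.3.16)$$

$$E_{z'z} = \frac{30\,I\Delta l}{jk}\left[k^2 F_1 + F_3\,(z-z')^2 + F_2\right] \qquad (2.3.17)$$

For the same reason, the Hertz dipole at point (x', y', z') being along the x-direction or y-direction produces E-field $\mathbf{E}_{x'}$ or $\mathbf{E}_{y'}$ at point (x, y, z) as following:

$$\mathbf{E}_{x'} = E_{x'x}\,\hat{x} + E_{x'y}\,\hat{y} + E_{x'z}\,\hat{z} \qquad (2.3.18)$$

$$E_{x'x} = \frac{30\,I\Delta l}{jk}\left[k^2 F_1 + F_3\,(x-x')^2 + F_2\right] \qquad (2.3.19)$$

$$E_{x'y} = \frac{30\,I\Delta l}{jk}\,F_3\,(y-y')(x-x') \qquad (2.3.20)$$

$$E_{x'z} = \frac{30\,I\Delta l}{jk}\,F_3\,(z-z')(x-x') \qquad (2.3.21)$$

$$\mathbf{E}_{y'} = E_{y'x}\,\hat{x} + E_{y'y}\,\hat{y} + E_{y'z}\,\hat{z} \qquad (2.3.22)$$

$$E_{y'x} = \frac{30\,I\Delta l}{jk}\,F_3\,(x-x')(y-y') \qquad (2.3.23)$$

$$E_{y'y} = \frac{30\,I\Delta l}{jk}\left[k^2 F_1 + F_3\,(y-y')^2 + F_2\right] \qquad (2.3.24)$$

$$E_{y'z} = \frac{30\,I\Delta l}{jk}\,F_3\,(z-z')(y-y') \qquad (2.3.25)$$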

Regarding a Hertz dipole with arbitrary direction, its total field $d\mathbf{E}$ can be expressed as following.

$$I\Delta l\,\hat{l} = c_x I\Delta l\,\hat{x} + c_y I\Delta l\,\hat{y} + c_z I\Delta l\,\hat{z} \qquad (2.3.26)$$

where $c_x$, $c_y$ and $c_z$ are the direction cosines of $\hat{l}$.

$$d\mathbf{E} = dE_x\,\hat{x} + dE_y\,\hat{y} + dE_z\,\hat{z} \qquad (2.3.27)$$

$$dE_x = \frac{30\,I\Delta l}{jk}\left\{c_x\left[k^2 F_1 + F_3\,(x-x')^2 + F_2\right] + c_y F_3\,(x-x')(y-y') + c_z F_3\,(x-x')(z-z')\right\} \qquad (2.3.28)$$

$$dE_y = \frac{30\,I\Delta l}{jk}\left\{c_y\left[k^2 F_1 + F_3\,(y-y')^2 + F_2\right] + c_x F_3\,(y-y')(x-x') + c_z F_3\,(y-y')(z-z')\right\} \qquad (2.3.29)$$

$$dE_z = \frac{30\,I\Delta l}{jk}\left\{c_z\left[k^2 F_1 + F_3\,(z-z')^2 + F_2\right] + c_x F_3\,(z-z')(x-x') + c_y F_3\,(z-z')(y-y')\right\} \qquad (2.3.30)$$
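
The field of an arbitrarily oriented Hertz dipole can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not the authors' program: it assumes the time factor $e^{j\omega t}$, takes $F_1 = e^{-jkR}/R$, $F_2 = \frac{1}{R}\frac{dF_1}{dR}$ and $F_3 = \frac{1}{R}\frac{dF_2}{dR}$ (the standard near-field auxiliary functions, since the source definitions are not legible), and uses $\eta_0/4\pi$ in place of the rounded factor 30. The function name `hertz_dipole_field` is hypothetical.

```python
import numpy as np

ETA0 = 376.730313668  # free-space wave impedance in ohms (about 120*pi)

def hertz_dipole_field(obs, src, direction, i_dl, k):
    """E-field (V/m) at `obs` of a Hertz dipole with moment i_dl (A*m) at `src`.

    `direction` holds the direction cosines (c_x, c_y, c_z); it is normalized
    here. `k` is the free-space wave number. Time factor exp(j*omega*t).
    """
    d = np.asarray(obs, dtype=float) - np.asarray(src, dtype=float)
    c = np.asarray(direction, dtype=float)
    c = c / np.linalg.norm(c)
    r = np.linalg.norm(d)
    phase = np.exp(-1j * k * r)
    f1 = phase / r
    f2 = -(1j * k * r + 1.0) * phase / r**3
    f3 = (3.0 + 3j * k * r - (k * r) ** 2) * phase / r**5
    # Matrix form of the component formulas: diagonal terms carry
    # k^2*F1 + F3*(x-x')^2 + F2, off-diagonal terms carry F3*(x-x')(y-y').
    dyad = (k**2 * f1 + f2) * np.eye(3) + f3 * np.outer(d, d)
    return ETA0 / (4.0 * np.pi * 1j * k) * i_dl * (dyad @ c)

# Example: z-directed dipole at the origin, wavelength 1 m, field on the z axis.
k = 2.0 * np.pi
e_axis = hertz_dipole_field((0.0, 0.0, 0.3), (0.0, 0.0, 0.0), (0, 0, 1), 1.0, k)
```

Summing such contributions over the antenna segments and over the surface-current patches gives the total field, as described in the text that follows.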


The E-field over the body of revolution is composed of the linear antenna's field and the body surface current's field. For the linear antenna, given its geometry parameters and current distribution, its field is obtained from the integral of $d\mathbf{E}$ along the antenna length.

$$\mathbf{E}_l = \int_{\text{antenna}} d\mathbf{E} \qquad (2.3.31)$$

From section 2.1, the body surface current is obtained as (2.3.33) under the (u, v, $\varphi$) coordinates. For the convenience of calculation, it should be transformed into (x, y, z) coordinates, as shown in (2.3.32).


i/\!c 

-  (CT 

cos  (p'  cos  69'— <T 

sincp'  )R'dv' dip' 

lAIc 

V 

=  (a 

sin  (p'  cos  0'-<j 

cos  ip'  )R'dv'd(p' 

//1/f 

=  -a 

sin  0'  R'dv'dip' 

<T  : 
<P 

f: 

Hon 


f  <T  e  ’""’'’' 

vtn 

(2.3.33) 

y  o- 

v>‘ 

From the integral of dE over the body surface, the field contributed by the surface current is obtained:

$\mathbf{E}_{BOR} = \iint_{\text{surface}} d\mathbf{E}$ (2.3.34)

The total E-field, therefore, is:

$\mathbf{E} = \mathbf{E}_l + \mathbf{E}_{BOR}$ (2.3.35)

The normal component of the E-field on the conducting body of revolution is of most concern, and it is computed from (2.3.36).

$E_{normal} = (E_x \cos\varphi + E_y \sin\varphi)\sin\theta + E_z \cos\theta$ (2.3.36)
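As a minimal sketch of the projection in (2.3.36), the following Python function computes the normal component from the Cartesian field components. The function name and argument order are illustrative assumptions, not taken from the paper; the field values may be complex phasors.

```python
import math


def normal_component(ex, ey, ez, theta, phi):
    """Normal E-field component per (2.3.36).

    Projects the Cartesian field (ex, ey, ez) onto the direction
    (sin(theta)cos(phi), sin(theta)sin(phi), cos(theta)).
    The name and signature are hypothetical; angles are in radians.
    """
    return (ex * math.cos(phi) + ey * math.sin(phi)) * math.sin(theta) \
        + ez * math.cos(theta)
```

At theta = 0 the result reduces to the z component, and at theta = π/2, phi = 0 it reduces to the x component.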
We have now obtained all the formulas needed to compute the E-field distribution. In the next section, some examples are presented.

3  Computations 

The E-field distribution of a monopole on an infinite plane, as shown in Fig. 4, is computed first. The result is compared with that obtained by the image method. The monopole is λ/4 high and carries the current sin(k(λ/4 - l)). The results of the two methods are listed in Table 1, and they agree well. This supports the validity of our method and its suitability for computing the E-field distribution.


Fig. 4 A monopole on an infinite plane
Fig. 5 A monopole on a conducting sphere

Another example is the E-field distribution of the same monopole on a conducting sphere whose radius is λ/4. The monopole's position and the associated coordinates are shown in Fig. 5. In Fig. 6 and Fig. 7, the strength of the normal component of the E-field is plotted against the generatrix length v and the azimuth angle φ, respectively. Fig. 8 shows a three-dimensional E-field distribution on the sphere surface. The three dimensions are the length v and the angle φ of the field position, and the strength of the normal component of the E-field. From these figures, it is clear that the E-field strength decreases rapidly with increasing distance from the antenna.


Table 1 Comparison of results from the image method and the method in this article

Field point (x, y, z)   Method                   Ex                      Ey           Ez
(0.5, 0, 0.5)           Image method             -38.71429 + j15.63384   0.0 + j0.0   38.77425 + j22.92286
                        Method in this article   -38.73227 + j15.61806   0.0 + j0.0   38.78413 + j22.9743
(0.5, 0, 0.25)          Image method             -40.89494 + j11.29625   0.0 + j0.0   40.87492 + j71.29628
                        Method in this article   -40.8786 + j11.30152    0.0 + j0.0   40.89369 + j71.3437
(0.5, 0, 0.1)           Image method             -19.6096 - j9.390205    0.0 + j0.0   39.33406 + j94.72982
                        Method in this article   -19.5676 - j9.36906     0.0 + j0.0   39.3349 + j94.76012
(0.5, 0, 0.05)          Image method             -10.05461 - j5.122965   0.0 + j0.0   39.0087 + j98.68033
                        Method in this article   -5.871329 - j6.735781   0.0 + j0.0   39.60166 + j99.51534


4  Conclusion 

We have discussed a method for obtaining the E-field distribution of a linear low-gain antenna on a conducting body of revolution. The trend of the distribution is clearly shown by the computed data and by the two-dimensional and three-dimensional figures. When arranging other antennas, we can choose among those sites where the E-field is weak, and thus reduce the probability of antenna interference. Conversely, if an antenna is located in an area with a strong E-field, EMI may occur, and the antenna can be moved to another site where the E-field is weaker.
An Implementation of an Exact Scheme for Problem Decomposition Via the Use of Aperture Admittance

Debra L. Wilkes, Chung-Chi Cha, Thomas Krauss
Syracuse  Research  Corporation 
Merrill  Lane 
Syracuse,  NY  13210 

1.0  Introduction 

In  this  paper  an  approach  is  given  for  the  practical  solution  of  a  class  of  large  computational 
electromagnetics  problems.  While  the  application  of  a  numerical  method  such  as  the  Method  of 
Moments  (MOM)  to  a  large  problem  is  often  not  possible  due  to  limitations  in  computer  resources 
(speed  and  memory),  this  same  problem  can  often  be  decomposed  into  smaller  tractable  problems.  An 
excellent  example  of  a  target  that  is  well  suited  to  decomposition  techniques  is  an  aircraft  where  the  jet 
engine  cavities  are  naturally  separated  from  the  aircraft  exterior  by  the  aperture  formed  by  the  entrance 
to  the  inlet. 

This  decomposition  can  be  performed  without  the  loss  of  exactness  by  separating  an  original 
shape  into  an  exterior  surface  and  one  or  more  interior  regions.  Each  interior  region  is  coupled  to  the 
exterior  through  an  aperture.  The  solution  of  the  interior  problem(s)  and  the  exterior  problem  can  be 
performed  independently  as  long  as  provisions  are  made  to  correctly  pass  the  appropriate  coupling 
information  between  the  two  solutions. 

We  can  realize  a  computational  saving  by  decomposing  a  large  problem  into  two  separate 
smaller  problems.  In  addition  to  the  computational  savings,  this  problem  decomposition  allows  for  the 
convenient  use  of  a  hybrid  approach  where  different  techniques  can  be  applied  to  their  suitable 
respective  portions  of  the  target. 

2.0  Formulation 

The class of problems suitable for this type of problem decomposition is illustrated in Figure 1, which shows a generally shaped target with a cavity-like feature. The entire problem is decomposed into two separate problems: the interior problem and the exterior problem. The sources are assumed to exist in the exterior region. The interior region is bounded by the surface $S_1$, which is the cavity wall, and a surface $S_a$ at the aperture. The exterior problem is defined by the surfaces $S_2$ and $S_a$. After the interior problem has been solved, we apply the Method of Moments using subdomain basis functions to solve the exterior problem.
Figure 1. Decomposition of Original Problem into Two Separate Problems (panels: Interior Problem, Exterior Problem)

When only the interior problem is considered, the electric and magnetic fields cannot be determined, because the complete problem information is not available: the incident field and the shape of the exterior region are unknown. However, the uniqueness theorem tells us that knowledge of the tangential electric field in the aperture, $E_a$, is sufficient to determine all fields within the cavity. Consequently, the tangential magnetic field in the aperture, $H_a$, can be expressed as a function of $E_a$. This relationship can be expressed by using a generalized admittance of the aperture:

$\hat{n} \times \mathbf{H}_a = Y(\mathbf{E}_a)$ (1)

where n is the outward unit normal of the aperture and Y is the linear admittance operator that transforms $E_a$ into $n \times H_a$. This admittance relates two "distributions." When $E_a$ and $n \times H_a$ are represented by their samples at discrete locations within the aperture, Y takes the form of a dense matrix. Note that we have chosen this definition for Y instead of one that relates $E_a$ and $H_a$ directly. This was done to parallel the formulation used for an Impedance Boundary Condition (IBC). Thus, if Y were used to represent an IBC, it would take the form of a diagonal matrix, since the IBC is a pointwise relationship between the two fields.

After the interior problem is solved, the next step begins with the hand-over of the admittance matrix to the exterior problem. We note that the interior problem may be solved by arbitrary means, as long as the solution can be used to describe the aperture via the desired admittance matrix. We define the aperture characterization as consisting of the following information:

1. A set of N discrete points, $\{P_i \mid i = 1, \ldots, N\}$, in the aperture. The way in which they lie in the aperture is illustrated in Figure 2, which shows an aperture in a cylindrical body.

2. Two unit vectors, $(u_i, v_i)$, at each point in the aperture, both tangent to the aperture surface.

3. The aperture admittance matrix [Y], of size 2N×2N, which corresponds to the operator in Equation (1). The 2N samples correspond to the vector components along the two specified directions at the N specified points. Thus, Equation (1) will have the form


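$\begin{bmatrix} (\hat{n} \times \mathbf{H}_a(P_i)) \cdot \mathbf{u}_i \\ (\hat{n} \times \mathbf{H}_a(P_i)) \cdot \mathbf{v}_i \end{bmatrix} = \begin{bmatrix} Y_{uu} & Y_{uv} \\ Y_{vu} & Y_{vv} \end{bmatrix} \begin{bmatrix} \mathbf{E}_a(P_j) \cdot \mathbf{u}_j \\ \mathbf{E}_a(P_j) \cdot \mathbf{v}_j \end{bmatrix}$ (2)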


Figure  2,  Distribution  of  Grid  Points  on  an  Aperture  with  a  Rectangular  Grid 

The exterior problem is solved using MOM, where the equivalent electric and magnetic currents, $J = n \times H_a$ and $M = E_a \times n$, are the two unknown quantities. This solution utilizes the admittance matrix to reduce the equivalent currents in the aperture to one independent variable. The equivalent currents are expanded using a set of subdomain basis functions $\{f_k \mid k = 1, \ldots, M\}$ on the aperture surface:

$\mathbf{J} = \sum_k \alpha_k \mathbf{f}_k$ (3a)

$\mathbf{M} = \sum_k \beta_k \mathbf{f}_k$ (3b)

An illustration of an aperture divided up into rectangular subdomains was shown in Figure 2. We select J to be the independent variable and M to be the dependent variable. We can write Equation (2) as

$[\langle \mathbf{w}_i, \mathbf{J} \rangle] = [Y]\,[\langle \mathbf{w}_j, \hat{n} \times \mathbf{M} \rangle]$ (4)
where $w_i$ represents both the u and v tangent vectors, depending on the value of i. Substituting Equation (3) into (4), we get

$[\langle \mathbf{w}_i, \mathbf{f}_k \rangle]\,\alpha = [Y_{ij}]\,[\langle \mathbf{w}_j, \hat{n} \times \mathbf{f}_k \rangle]\,\beta, \quad k = 1, \ldots, M$ (5)

or

$[F]\,\alpha = [Y]\,[G]\,\beta$

This equation can be solved for β using the method of least squares provided that 2N > M. Using the superscript H to denote the Hermitian (conjugate transpose) of a matrix, we find that

$\beta = [(YG)^H (YG)]^{-1} (YG)^H F\,\alpha$ (6)

With this expression, which gives the coefficients of the magnetic equivalent current in terms of the coefficients of the electric equivalent current, both currents can be described by the coefficients $\alpha_k$. These coefficients can be solved for using the standard MOM formulation.
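The least-squares step in Equation (6) can be sketched numerically as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the matrices Y (2N×2N), G and F (both 2N×M) are assumed to be already assembled as dense NumPy arrays, and `numpy.linalg.lstsq` is used in place of the explicit normal equations because it is usually better conditioned. The two give the same result when YG has full column rank.

```python
import numpy as np


def beta_from_alpha_map(Y, G, F):
    """Return B such that beta = B @ alpha in the least-squares sense.

    Solves min || (Y @ G) @ B - F || column by column, which equals
    [(YG)^H (YG)]^{-1} (YG)^H F when YG has full column rank.
    Name and interface are illustrative assumptions.
    """
    YG = Y @ G
    B, *_ = np.linalg.lstsq(YG, F, rcond=None)
    return B
```

Once B is available, the magnetic-current coefficients follow from any electric-current coefficient vector as `beta = B @ alpha`, so only the α coefficients remain as unknowns.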

3.0  Software  Implementation 

This capability has been implemented in a general-purpose Method of Moments code called PARAMOM (Parametric Method of Moments), which was developed by Syracuse Research Corporation. Here, the term parametric refers to the fact that the basis functions used are a modification of the RWG (Rao, Wilton and Glisson) triangular patch basis functions which lie on a curved parametric surface.

Briefly, the user supplies a data file with the aperture admittance matrix and the associated supporting information as defined in the previous section. A portion of the target is designated as the aperture surface. Thus it is important to have a target geometry model in which the aperture "hole" is closed off with a surface. A few issues arise in the implementation of this scheme in a computer code. These issues are briefly discussed in the following paragraphs.

Issue 1: The coordinate systems for the interior and exterior problems are different. This problem was treated by the sufficient but somewhat rudimentary approach of allowing the user to specify a rotation and a translation vector which transform the aperture (local) coordinates into the entire target (global) coordinates.
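A minimal sketch of such a local-to-global transformation is given below. It assumes the rotation is supplied as a 3×3 rotation matrix; the paper does not say how the rotation is parameterized, so this representation and the function names are illustrative assumptions. Grid points are rotated and translated, whereas the tangent unit vectors are only rotated.

```python
import numpy as np


def local_to_global(points, rotation, translation):
    """Map aperture grid points from local to global coordinates.

    points: (n, 3) array; rotation: (3, 3) rotation matrix (assumed
    representation); translation: length-3 vector.
    """
    points = np.asarray(points, dtype=float)
    return points @ np.asarray(rotation, dtype=float).T + np.asarray(translation, dtype=float)


def rotate_vectors(vectors, rotation):
    """Rotate tangent unit vectors; directions are not translated."""
    return np.asarray(vectors, dtype=float) @ np.asarray(rotation, dtype=float).T
```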


Issue 2: The grid points may not lie exactly on the surface. An example of this is an aperture which lies in a slightly curved surface where the aperture admittance was generated assuming a planar aperture face. As long as the grid points do not lie too far off the actual surface in the target, this can be dealt with by using an algorithm that minimizes the distance between the grid point and the parametric surface to find the point on the surface that is closest to the grid point.
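One way to carry out such a distance minimization is a Gauss-Newton iteration on the surface parameters. The sketch below is not the algorithm used in PARAMOM, which the paper does not specify. It assumes the surface is available as a callable returning a 3D point for a parameter pair (u, v); the paraboloid is a made-up test surface.

```python
import numpy as np


def closest_point_on_surface(surface, p, uv0, iters=20, h=1e-6):
    """Gauss-Newton minimisation of |surface(u, v) - p|^2 over (u, v).

    surface: callable (u, v) -> length-3 array (assumed interface).
    Returns the parameter pair and the corresponding surface point.
    """
    uv = np.array(uv0, dtype=float)
    p = np.asarray(p, dtype=float)
    for _ in range(iters):
        u, v = uv
        r = surface(u, v) - p  # residual vector
        # Central-difference tangent vectors dS/du and dS/dv
        su = (surface(u + h, v) - surface(u - h, v)) / (2 * h)
        sv = (surface(u, v + h) - surface(u, v - h)) / (2 * h)
        J = np.column_stack([su, sv])  # 3x2 Jacobian
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        uv += step
        if np.linalg.norm(step) < 1e-12:
            break
    return uv, surface(*uv)


def paraboloid(u, v):
    """Hypothetical slightly curved test surface z = 0.1 (u^2 + v^2)."""
    return np.array([u, v, 0.1 * (u * u + v * v)])
```

At the converged point the vector from the grid point to the surface is orthogonal to both tangent vectors, which is the condition for a local distance minimum.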

Issue 3: In order to form the matrix equation in Equation (5), we need a dense, uniform distribution of grid points. There should be at least one grid point in each subdomain. Since both grids should be sufficiently dense to accurately represent the tangential fields at a particular frequency, this should not be a problem.

4.0  Results  and  Conclusions 


In this section, some results for the radar cross section (RCS) of a dielectric-filled rectangular cavity are given. The shape is an open perfectly conducting (PEC) cube. The interior problem for this shape, illustrated in Figure 3, can be solved using a modal expansion of the interior fields to generate the aperture admittance matrix.


Figure  3.  Geometry  of  Open  Cavity 

In the following figures, the monostatic RCS for an open PEC cube (a = b = c = 0.909λ) is shown for three cases: 1) ε_r = 1 (air-filled) and θθ polarization, 2) ε_r = 2 and θθ polarization, 3) ε_r = 2 and φφ polarization. The results generated using the aperture admittance are compared against two other numerical methods. In Figure 4, the RCS for Case 1 is shown as computed by the aperture admittance formulation described in this paper and by two other methods. The second method is simply to model the entire open-faced cube. The third method uses the capability of the ParaMOM code to model bulk dielectrics using a surface formulation: the open box is filled with a dielectric region having the properties of free space. All three methods produce very similar results. The RCS of the corresponding closed cube is included for reference. Figure 5 shows the RCS for Cases 2 and 3. They are compared against the method that fills the cube with a bulk dielectric.

In  conclusion,  we  have  shown  how  a  type  of  problem  decomposition  using  an  aperture 
formulation  can  be  developed  and  implemented  into  a  code  which  solves  the  exterior  problem.  This  has 
been  validated  using  a  simple  test  shape.  This  can  be  easily  extended  to  work  with  other  interior 
formulations  and  more  complicated  shapes. 
Figure 4. RCS of an Open Cube, L = 0.909λ, θθ Polarization

Figure 5. RCS of a Dielectric-Filled Open Cube, ε_r = 2, L = 0.909λ: a) θθ Polarization, b) φφ Polarization
Abstract

In this paper, we discuss the development and implementation of computational electromagnetics techniques on a massively parallel architecture, Thinking Machines Corporation's CM-5. We focus on advanced numerical formulations and parallel implementation for electromagnetic scattering problems. The goal of this work is to combine state-of-the-art numerical algorithms and massively parallel processing to increase the maximum target size that can realistically be solved using the method of moments (MoM) technique.

We have parallelized ParaMoM, a MoM program which utilizes the parametric patch formulation developed at the Syracuse Research Corporation (SRC). ParaMoM-MPP, the parallel implementation of ParaMoM, has been demonstrated to produce good performance and accuracy.


1  Introduction 

The problem we are addressing is the prediction of the radar cross section (RCS) of 3D arbitrarily shaped, electrically large targets. We report results obtained by combining state-of-the-art CEM techniques and state-of-the-art massively parallel processing technologies. Practical RCS prediction using numerical methods has long been thought of as unrealistic. This is because numerical solutions, while exact in concept, demanded amounts of computation and memory too large to accomplish in the past. The RCS prediction of a fighter-sized aircraft using MoM, for example, requires the solution of a matrix equation whose dimension can easily exceed 100,000. The impossibility of such computations also discouraged efforts to improve other aspects of CEM techniques. Successful developments of massively parallel processing (MPP) technologies have moved us into a position from which the prospects for solving the above-mentioned problem now look much better. Parallel computing not only drastically improves speed, it also prompts new developments in CEM techniques by bettering the prospects of real problem solutions.

This work was supported by the Electromagnetic Code Consortium, ARPA, and the Signature Technology Office of Wright Laboratory, Air Force Materiel Command (ASC), U.S. Air Force, Wright-Patterson Air Force Base, Ohio 45433-5000.

In this paper, we discuss the development and implementation of an MoM code which computes RCS and antenna radiation on a Massively Parallel Processing (MPP) architecture. We utilize the parametric patch model proposed by Wilkes and Cha [2]. The parametric formulation has been implemented by Syracuse Research Corporation (SRC) in a program called ParaMoM. ParaMoM utilizes curvilinear surface patches in conjunction with a new type of basis function suited to the curved domains. A brief overview of the parametric formulation is presented in Section 2.

The ParaMoM code has been parallelized on various MIMD distributed-memory systems, including Thinking Machines Corporation's CM-5 and the Intel Paragon. We present the details of the implementation on the CM-5 in Section 3.

The  performance  and  scalability  of  ParaMoM-MPP  are  discussed  in  Section  4.  Results  for 
some  of  the  Electromagnetic  Code  Consortium  (EMCC)  benchmark  cases  are  shown  and  discussed 
in  Section  4.  The  results  obtained  by  ParaMoM-MPP  are  in  good  agreement  with  the  EMCC 
benchmark  measurement  data. 

We  present  our  conclusions  in  Section  5. 


2  An  Overview  of  the  ParaMoM  Formulation 

In  this  section  we  summarize  the  parametric  patch  formulation,  originally  proposed  by  Wilkes 
and  Cha  [2],  used  in  the  ParaMoM-MPP  code.  The  parametric  MoM  formulation  is  defined  for 
a  class  of  surfaces  which  can  be  represented  by  two  surface  parameters  (u  and  v).  Each  point 
on  the  target  surface  corresponds  to  a  (u,v)  coordinate  pair.  Most  popular  geometry  modeling 
programs  can  produce  such  a  parametric  surface  representation  (e.g.  NURBS  or  bicubic  spline 
representations). 

The basis functions used in the parametric formulation are similar to the well-known Rao-Wilton-Glisson (RWG) [3] functions but are defined in the parametric space of the surface. The basis function domains are triangles in the parametric (u,v) space. The potential integrals are performed on the basis function domains, which conform to the curved target surface rather than the usual flat facet approximation. Therefore, errors introduced by facetization are eliminated. The basis functions have all the desirable properties of the RWG functions, i.e., artificial charge accumulations at patch boundaries are avoided. When the patch dimensions become small compared to the radius of curvature, this basis function approaches the RWG basis function for a flat triangle. This basis function is defined in terms of a general surface parameterization, and it is not tied to any specific surface parameterization. This feature has allowed for simple modular inclusion of several different parameterizations in the method of moments procedure.
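For reference, the flat-triangle limit mentioned above can be sketched as follows. This is the standard RWG basis function on a pair of flat triangles sharing an edge, not the curved parametric variant used in ParaMoM; the function names and argument layout are illustrative assumptions.

```python
import numpy as np


def triangle_area(a, b, c):
    """Area of the flat triangle with vertices a, b, c."""
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a))


def rwg_value(r, free_plus, free_minus, edge_a, edge_b, in_plus):
    """Standard flat-triangle RWG basis function evaluated at point r.

    The shared edge runs from edge_a to edge_b; free_plus and free_minus
    are the vertices opposite that edge in the two triangles.
    in_plus selects which triangle contains r.
    """
    edge_len = np.linalg.norm(edge_b - edge_a)
    if in_plus:
        area = triangle_area(free_plus, edge_a, edge_b)
        return edge_len / (2.0 * area) * (r - free_plus)
    area = triangle_area(free_minus, edge_a, edge_b)
    return edge_len / (2.0 * area) * (free_minus - r)
```

The component normal to the shared edge is equal to one on both sides of the edge, so no line charge accumulates there. This is the property that the parametric basis functions are stated to retain.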

The ParaMoM-MPP code includes a number of capabilities and features. E-field, H-field, and Combined Field formulations are supported. An IBC formulation is included in the CM-5 version for use with material-coated targets. Parametric geometries are accepted in the standard IGES 114 (bicubic) and IGES 128 (NURBS) formats, as well as the SRC-developed GPE format. Wire antennas may be attached to the target surface. Gridding and material specification are performed in the ParaMoM geometry preprocessor. Up to three symmetry planes may be utilized to reduce computational and memory requirements for symmetric targets.


3  Parallel  Implementation 

The ParaMoM [4] code has the following phases: Setup, Precomputation, Matrix fill, Right-hand side vector fill, Matrix factorization, Back-substitution solution, and Far-field solution. The setup phase reads in the parameters required for the computation. Precomputation computes arrays of position vectors and basis functions at each integration point. The matrix fill phase fills the elements of the moment matrix, which can be one of EFIE, MFIE, or CFIE, as decided at setup time. The right-hand side vector fill phase computes the excitation (right-hand side) vectors. The matrix factorization phase decomposes the moment matrix into lower and upper triangular matrices; back substitution is then applied to solve the linear system. Finally, the far field and RCS are calculated in the far-field solution phase.

The computational expenses of the setup phase, precomputation phase, RHS vector fill phase, and scattered-field computation phase are of order N, where N is the number of unknowns (proportional to the surface area of the target). The matrix fill phase is of order $N^2$, the matrix factorization is of order $N^3$, and the matrix solution is of order $N^2$. Because our primary interest in applying parallel computers lies in reducing the total time required to solve large problems (i.e., large N), we concentrate our energy on reducing the $N^2$ and $N^3$ processes. All phases of the program were parallelized in this effort.

The  explicit  message-passing  approach  has  a  number  of  advantages  for  the  MoM  application. 
Performance  of  a  message-passing  program  is  less  dependent  on  the  interconnection  topology  of 
the  parallel  machine  than  it  is  in  data-parallel  and  shared-memory  approaches.  There  is  more 
direct  control  over  the  message  volume  and  timing.  Therefore,  message-passing  programs  are  more 
likely  to  port  efficiently  from  one  parallel  architecture  to  another.  In  addition,  the  functions  of 
the  message-passing  library  are  quite  similar  on  all  architectures;  differences  are  mainly  in  syntax 
rather  than  in  philosophy. 

The CM-5 implementation utilizes the software supplied by Thinking Machines Corporation: the CMMD message-passing library and the CMSSL matrix solver. The CMSSL library uses a data-parallel programming model and cannot be globally interfaced with the message-passing matrix-filling algorithm. Therefore, in our implementation, the matrix fill and the matrix factorization and solution (factor/solve) are two distinct program units. A high-speed device, such as the scalable data array (SDA) or DataVault, is used to link these two stages. The message-passing MoM matrix-filling program fills the matrix and writes it to a file in the format required for the factor/solve stage. The matrix is subsequently read in by the data-parallel matrix solver stage.

There is very little performance penalty for using separate program units for the filling and factor/solve stages because the DataVault or the SDA has very high data rates. As an illustration, the elapsed time for writing the moment matrix to an SDA file, as measured by the CMMD timer, is shown in Table 1. The writing operation is executed by the CMMD global write under CMMD synchronous sequential mode. One can see that the extra effort in using a high-performance storage device to utilize both the message-passing paradigm and the data-parallel paradigm is justified.

Table 1: The elapsed time of writing the moment matrix to an SDA file using CMMD global write under CMMD synchronous sequential mode (recorded using the CMMD timer).

N        32-node write   32-node fill   512-node write   512-node fill
3000     1.19 s          1207 s         0.39 s           94.1 s
5000     2.93 s          3295.2 s       0.78 s           237.9 s
10000    -               -              1.47 s           901 s

There are a number of methods available to solve a dense linear system of equations. In specialized problems, iterative approaches can be quite efficient in terms of memory and computation time. However, iterative techniques suffer when used to treat many right-hand side vectors. In this study, it is of interest to solve systems with a large number of right-hand side vectors (i.e., scatterers illuminated by multiple incident waves), and iterative methods were not considered. The Gaussian elimination method is employed under these conditions, since the computationally intensive factorization need only be performed once.
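The benefit of a direct solver for many excitations can be illustrated with a small serial sketch. It uses NumPy rather than the CMSSL routines of the actual code, and the function name is an assumption. `numpy.linalg.solve` accepts a matrix of right-hand sides and factors the coefficient matrix once for all columns.

```python
import numpy as np


def solve_all_excitations(Z, V):
    """Solve Z @ I = V for every column of V with one LU factorization.

    Z: (n, n) moment matrix; V: (n, n_rhs) excitation vectors, one per
    incident wave. Serial illustration only, not the parallel solver.
    """
    return np.linalg.solve(Z, V)
```

With N unknowns and K excitation vectors, the cost is roughly one order-N³ factorization plus K order-N² substitutions, rather than K independent solves.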

In addition to computational cost concerns, it is necessary to consider memory requirements. On the class of computers we are considering, the main memory (RAM) is distributed among the nodes. The moment matrix is generally much too large to fit in the memory of one node, so it must be distributed equitably among the nodes. The matrix size is of order $N^2$, but all other arrays used in the ParaMoM code are of order N or lower.
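The order-N² storage requirement can be made concrete with a short calculation. The sketch assumes 8 bytes per matrix entry (single-precision complex); the paper does not state the precision used, so this value and the function names are assumptions.

```python
def matrix_memory_bytes(n_unknowns, bytes_per_entry=8):
    """Storage for a dense N x N moment matrix.

    bytes_per_entry=8 assumes single-precision complex entries.
    """
    return n_unknowns ** 2 * bytes_per_entry


def per_node_bytes(n_unknowns, n_nodes, bytes_per_entry=8):
    """Matrix storage per node when distributed evenly over the nodes."""
    return matrix_memory_bytes(n_unknowns, bytes_per_entry) / n_nodes
```

Under this assumption, a matrix with N = 100,000 unknowns occupies 8 × 10^10 bytes (about 80 GB), while N = 10,000 spread over 512 nodes needs about 1.56 × 10^6 bytes of matrix storage per node.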

To parallelize the ParaMoM code, we apply the problem partition approach, which has received the most attention in scientific computing applications. In this approach, each processor executes substantially the same program, but on a portion of the problem data. Processors are loosely coupled throughout the computation, exchanging information whenever necessary. This approach is well suited to solving large problems, where all available memory is required. The main implementation difficulty is partitioning the problem to achieve a good load balance.

The task of filling the moment matrix is distributed among the nodes in the system. Each node has the responsibility of computing a number of columns of the matrix. The implementation of the setup phase utilizes parallel I/O to achieve parallel read. The implementation of the precomputation phase distributes the tasks of computing arrays among the nodes: each node computes only a portion of each array, and then all nodes gather the results from all the other nodes. The matrix fill implementation is illustrated in Figure 1. The RHS vector fill implementation distributes the vectors among the compute nodes. The matrix factorization and back substitution have different implementations on the different architectures. The CM-5 implementation uses CMSSL subroutines. The far-field and RCS computation implementation lets each node compute the far field in terms of its local solution vector obtained by solving the moment matrix equation; the contributions are then summed and placed in one node.
Parallel Moment Matrix Fill Algorithm

1. Loop over source patches, from one to the number of patches:

2. Loop over the three possible edges on the source patch:

   (a) Find the global index of the edge;

   (b) Compute the index of the node whose slab includes the column whose number is the same as the global edge index;

   (c) Compute the local index for the location of the column in the node;

   (d) The selected node increases its job counter by one and stores the information of the edge (the global edge index, the local edge index, and the local position in the node);

3. Those nodes whose job counter is 0 go back to 1;

4. Loop over field patches, from one to the number of patches:

   (a) Compute all the integration points for the source-field patch pair interaction (Green's function);

   (b) Loop over source basis functions for that node, from 1 to its job counter:

       • Loop over all possible field basis functions (or edges on the field patch)

       • Use the formula derived in Chapter 2 to compute the elements of the moment matrix;

       • Sum all the contributions from the four pairs of patches associated with each source-field basis function;

End of the source patch loop.


Figure 1: The pseudo code of the parallel fill algorithm
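Steps 2(a) to 2(d) of the algorithm above amount to mapping each global edge (column) index to an owning node and a local column index. A minimal serial sketch follows, assuming a contiguous column-slab partition and 0-based indices. The paper does not give the exact slab layout, so the layout, function names, and data structures here are illustrative assumptions.

```python
def column_owner(global_col, n_cols, n_nodes):
    """Return (node, local_col) for a contiguous column-slab partition.

    Each node is assumed to own ceil(n_cols / n_nodes) consecutive
    columns; the last node may own fewer.
    """
    cols_per_node = -(-n_cols // n_nodes)  # ceiling division
    return global_col // cols_per_node, global_col % cols_per_node


def build_job_lists(patch_edges, n_cols, n_nodes):
    """Collect, per node, the (patch, global edge, local column) jobs.

    patch_edges: for each source patch, the global indices of its
    (up to three) edges. A node with an empty list for a given patch
    would skip that patch, as in step 3.
    """
    jobs = {node: [] for node in range(n_nodes)}
    for patch, edges in enumerate(patch_edges):
        for edge in edges:
            node, local = column_owner(edge, n_cols, n_nodes)
            jobs[node].append((patch, edge, local))
    return jobs
```

In the actual message-passing code each node would evaluate this mapping itself and keep only its own jobs, so no communication is needed to decide which columns a node fills.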

4  Numerical  Results 

In this section we present the performance and scalability of ParaMoM-MPP and some numerical results computed by the ParaMoM-MPP code. Only selected architectures are shown here. Table 2 shows a performance comparison of ParaMoM on a one-processor SGI versus ParaMoM-MPP on various architectures.

The test targets used here were modeled using ACAD and gridded using the ParaMoM geometry preprocessor. All numerical results are compared with the measurements published by the EMCC (see references [6, 7]). For the results shown, the patch sampling rates used for the benchmark comparison runs are somewhat conservative and would be reasonable on a large problem. In most cases, the maximum edge length specified during target gridding was one-eighth of a wavelength.
Table 2: Time Comparisons

                          SGI Indigo R4000   Paragon      CM5           SP1 (est.)
                          (estimated)        (98 nodes)   (512 nodes)   (58 nodes)
Fill time (s)             57,000             888          901           871
  speedup rel. to SGI     1                  64           63            65
Factor time (s)           142,000            358          156           -
  speedup rel. to SGI     1                  397          910           -
Total time (s)            199,000            1715         1067          -
  (Fill + Factor + I/O)
  speedup rel. to SGI     1                  116          186           -

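The speedup rows in Table 2 are the ratio of the estimated single-processor SGI time to the parallel time. A short check of that arithmetic is sketched below; the function name is an assumption, and the times are those read from the table.

```python
def speedup(t_reference, t_parallel):
    """Speedup relative to the reference (single-processor) time."""
    return t_reference / t_parallel


# Fill-time speedups for the Paragon, CM5 and SP1 columns of Table 2
fill_speedups = [speedup(57_000, t) for t in (888, 901, 871)]
```

The computed ratios agree with the tabulated speedups to within rounding.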

Figure  2:  The  RCS  with  HH  polarization  of  a  wedge  cylinder  with  gap 

The first example is the RCS of a flat plate in the shape of a wedge cylinder with gap. The incident wave has a frequency of 5.9 GHz with horizontal polarization. The receiver's polarization is also horizontal. The parametric patch model of the target has the maximum edge length g, and the number of triangular patches is 1008 with 553 nodes and 1560 edges. The radius of the circular part of the wedge cylinder is λ and the length of its straight side is 2λ.

The monostatic radar cross section for horizontal transmitter and receiver polarizations (HH) is shown in Figure 2. The result was obtained by running ParaMoM-MPP on an 80-node partition of the NAS Intel Paragon. Six Mbytes of memory per node are required to run this target. There are 360 excitation vectors to compute for the monostatic radar cross section. The computation takes 309 seconds of wall-clock time. The RCS, in dB relative to λ², is plotted against the azimuthal angle in Figure 2 for a 10°-elevation conical cut.
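Because some results in this section are quoted in dB relative to a square wavelength and others in dB relative to a square meter, a small conversion helper can be useful when comparing curves. The Python sketch below uses the identity 10 log10(σ/λ²) = 10 log10(σ) - 20 log10(λ). The function names are hypothetical and are not part of ParaMoM.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def dbsm_to_dblambda2(rcs_dbsm, freq_hz):
    """Convert RCS from dB relative to 1 m^2 to dB relative to lambda^2."""
    wavelength = C0 / freq_hz
    return rcs_dbsm - 20.0 * math.log10(wavelength)


def dblambda2_to_dbsm(rcs_dbl2, freq_hz):
    """Convert RCS from dB relative to lambda^2 to dB relative to 1 m^2."""
    wavelength = C0 / freq_hz
    return rcs_dbl2 + 20.0 * math.log10(wavelength)


if __name__ == "__main__":
    # At 5.9 GHz the wavelength is about 5.08 cm.
    print(dbsm_to_dblambda2(0.0, 5.9e9))
```

At 5.9 GHz the offset between the two scales is roughly 25.9 dB, so a target of 0 dBsm appears at about 25.9 dB relative to λ².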

The “single ogive” target is illuminated by an incident wave at 9 GHz. It is gridded into two different parametric patch models. The first model utilizes three symmetry planes, so only one eighth of the single ogive needs to be modeled. This one-eighth single ogive is modeled by 2088 curved triangles with 3208 edges and 1121 nodes. The maximum edge length is [...]. The second model uses no symmetry plane. It consists of about 5000 curved patches and 7332 edges.

Figure 3: The RCS with HH polarization of a “single ogive”

The monostatic radar cross section characteristics for both HH and VV polarizations are plotted in dBSM as functions of the azimuthal angle in Figures 3 and 4. The elevation angle is zero. In Figures 3 and 4, computed_without_sym denotes the model that uses no symmetry planes, and computed_with_sym denotes the target modeled with three symmetry planes. Although the curved patches of the model without symmetry planes are much less dense than those of the model with symmetry planes, Figures 3 and 4 show good agreement. This demonstrates that the parametric patch method of moments still obtains good accuracy with relatively few unknowns. There are 360 RHS vectors for this problem, which was run on a 64-node partition of the NAS Paragon. The symmetry model requires 9 Mbytes per node and takes 9648 seconds to complete the run. The model without symmetry planes requires 21 Mbytes per node and takes 3918 seconds to complete the run.

5  Conclusion 

In this paper we describe the development and implementation of ParaMoM-MPP, the parallel version of ParaMoM, a MoM code which computes the RCS of 3-D arbitrarily shaped conducting bodies utilizing a parametric curved patch model. The performance data listed in Section 4 show good performance on three platforms. We demonstrate the accuracy of the code through comparison of computed data with published EMCC test case measurements.


Figure 4: The RCS with VV polarization of a “single ogive”

Abstract

In this paper, we present a library of tools for parallelizing moment method codes written for sequential machines. These codes are FERM, IBC3D, CARLOS-3D and SIG2D. We created the tools to parallelize the first 3D MOM code and used the same set of tools to parallelize other MOM codes with little effort. Our procedure includes:

1. An adequate understanding of the original code,

2. Code modification to conform to the input of the toolbox library,

3. Direct application of the subroutines in the toolbox library and

4. Validation.

1  Introduction 

Two years ago, we parallelized a 3D MOM (Method of Moments) code, FERM (Finite Element Radiation Modeling), to run on the iPSC/860. At the time, we did not anticipate that any part of the code would be reused for another project. Shortly afterwards, we needed to migrate another 3D MOM code, IBC3D, from the VAX to the iPSC/860. We found that we could reuse a lot of the subroutines used in the parallelization of FERM. We grouped these subroutines into a "toolbox" called BESLIB (Boeing Electromagnetics Subroutine Library). Since then, we have used BESLIB to migrate two other codes, CARLOS-3D and SIG2D, from sequential machines to the iPSC/860. In §2, we discuss the basic parallelization algorithm for MOM codes. In §3, we present the usage of BESLIB and the specifics in the parallelization of each of the MOM codes, FERM, IBC3D, CARLOS-3D and SIG2D, and the corresponding validation and timing results of each of these codes. In §4, we present the plans for future improvement of BESLIB.

2 MOM Codes for MIMD Machines

In the parallelization of MOM codes, we are concerned with the following issues:

1. Maximize the performance of the linear equation solver,

2. Minimize message passing and redundant arithmetic during the matrix generation stage and

3. Load balancing during the matrix generation stage.

The factorization of the impedance matrix is the most costly part of a MOM code. It is also application independent. Most hardware vendors provide out-of-core matrix equation solvers that are tuned for optimal performance of their machines. It is important that the solver has adequate computing resources for maximum performance. Since the solver does not need the geometric and electric data of the physical problem, it is best to have the factorization and solution of the impedance matrix as a separate program module so that the computation block size can be maximized. To address problem (1) above, we separate our MOM codes into three program modules:

• Impedance matrix equation generation,

• Impedance matrix factorization-solution and

• Postprocessing.

(Postprocessing includes RCS calculation, antenna excitation calculation and graphic output of the currents.)

We use the ProSolver-DES [13] as our impedance matrix equation solver. ProSolver-DES requires the matrix to be stored in a two-level block form. The matrix is stored as a 2D mesh of disk sections. Each disk section is a 2D mesh of node sections. We generate the impedance matrix in ProSolver-DES format. Often the units of computation and the unknowns are different. For example, in the 3D codes, if one uses the Rao-Wilton-Glisson (RWG) basis function, the units of computation are the triangle patches and the unknowns correspond to the interior edges of the triangulation of the model. We studied several different parallelization schemes for this problem [...]. For the 3D codes using the RWG basis function, the best scheme is to reorder the patches with the Reverse Cuthill-McKee (RCM) algorithm [4] so that patches which share the same edges are numbered together. Then the interior edges are reordered according to the new patch ordering. If processor k owns the submatrix Z(i_1 : i_n, j_1 : j_n), the list of source patches corresponding to unknowns j_1 to j_n and the list of field patches corresponding to unknowns i_1 to i_n are determined. Then the computation loops through these lists. This scheme avoids message passing and minimizes the amount of redundant arithmetic, thus providing a satisfactory solution for Problem (2). It also puts the weighty elements close to the diagonal and thus improves the condition of the impedance matrix.
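The effect of a Reverse Cuthill-McKee reordering can be illustrated on a toy patch graph in which two patches are neighbors when they share an edge. The following Python sketch is a minimal textbook RCM (breadth-first search from a low-degree node, neighbors visited in order of increasing degree, final order reversed). It is not the BESLIB routine, and `build_adjacency`, `reverse_cuthill_mckee` and `bandwidth` are hypothetical names chosen for this illustration.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {patch: set(neighbor patches)}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def reverse_cuthill_mckee(adj):
    """Return a patch ordering that tends to cluster neighbors together."""
    visited = set()
    order = []
    # Start each connected component from a node of lowest degree.
    for start in sorted(adj, key=lambda n: (len(adj[n]), n)):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            # Visit unvisited neighbors in order of increasing degree.
            for nb in sorted(adj[node] - visited, key=lambda n: (len(adj[n]), n)):
                visited.add(nb)
                queue.append(nb)
    order.reverse()
    return order


def bandwidth(adj, order):
    """Largest distance in the ordering between any two neighboring patches."""
    pos = {node: i for i, node in enumerate(order)}
    return max((abs(pos[a] - pos[b]) for a in adj for b in adj[a]), default=0)


if __name__ == "__main__":
    # A chain of six patches with scrambled labels.
    adj = build_adjacency([(0, 3), (3, 1), (1, 5), (5, 2), (2, 4)])
    print(bandwidth(adj, sorted(adj)), bandwidth(adj, reverse_cuthill_mckee(adj)))
```

For the scrambled chain in the example, the reordering brings neighboring patches next to each other, which is the property that moves the large near-interaction terms toward the matrix diagonal.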

All robust MOM codes perform singularity extractions when the source and field patches are close together. Therefore the computation time for the near diagonal terms can be 30 times as much as the other terms. To load balance the matrix generation, we compute all the diagonal node sections in parallel before computing the other terms.

In summary, our parallel MOM codes differ from their sequential versions in the following aspects:

• It is separated into three modules.

• It uses an RCM ordering.

• It generates the diagonal node sections first.

3  Usage  of  BESLIB 

Using BESLIB to parallelize a MOM code, one needs to provide three routines:

1. MATRIX, which generates a block of the impedance matrix,

2. RHS, which generates a block of the RHS corresponding to the excitation vectors,

3. MULTSCA, which generates a block of the matrix corresponding to the scattering vectors.

The subsections of this section describe the specifics in the parallelization of each of the MOM codes, FERM, IBC3D, CARLOS-3D and SIG2D. The validation for the migrated codes includes

1. comparing output with that of the original codes,

2. comparing with available analytical results, e.g., the Mie series,

3. comparing with measured data.

In all cases, the parallelization process preserves the integrity of the original codes.

3.1 Parallelizing FERM

FERM was originally written in 1987 by Lee and Shnidman from the Lincoln Laboratory [8] for the VAX. It admits perfectly conducting and resistive boundary conditions and uses the Rao-Wilton-Glisson basis functions [9]. Since the Intel solvers use double precision arithmetic, we converted the matrix generation into double precision arithmetic. We parallelized FERM twice on the iPSC/860. The first parallelization [11] was based on the predecessor of ProSolver-DES, called LOOCS (Large Out-Of-Core Solver) [12]. LOOCS combined the matrix generation and solution into one subroutine and did not allow the user the flexibility to [...] as stated in §2. Therefore the performance of the solver in the application is worse than that of the certification program, which solves a matrix of random numbers. The second parallelization is based on ProSolver-DES [13]. Besides the solver, it contains versatile I/O subroutines that allow the user to optimize his codes. Table 1 shows the timing results of FERM using 64 processors with 16 Mbytes of memory on each processor. The targets have one plane of symmetry.

Note that in each of the above cases, two systems of equations are solved, each of size approximately equal to half of the number of unknowns.

No. of Unknowns   Fill Time (hrs)   Factor Time (hrs)   Total Time
?                 0.244             0.139               0.421
10,200            0.379             0.282               0.724
10,???            0.408             0.208               0.800
13,408            0.010             0.003               1.00?
21,000            1.402             1.380               3.003

Table 1: Timing Results for FERM


3.2  Parallelizing  IBC3D 

IBC3D was originally developed by Falco et al. at Riverside Research Institute for the VAX [...]. It admits anisotropic boundary conditions. It also uses the RWG basis functions. IBC3D was very much like FERM. In the parallelization of IBC3D, we retraced most of the steps we went through in the parallelization of FERM. As stated in the introduction, we reused a lot of the subroutines used in the parallelization of FERM to parallelize IBC3D. We grouped these subroutines together into a 'toolbox' called BESLIB. Table 2 shows the timing results of a series of perfectly conducting spheres. IBC3D does not take advantage of the symmetry planes of the targets.


No. of Unknowns   Fill Time (hrs)   Factor Time (hrs)   Total Time
5,000             0.1               0.15                0.2
10,000            0.35              0.8                 1.2
15,000            0.85              2.2                 3.1
25,000            2.80              6.8                 9.5

Table 2: Timing Results for IBC3D


Isotropic and anisotropic treatments require a significantly longer time to generate the matrix, as shown in Figure 1.

3.3  Parallelizing  CARLOS-3D 

CARLOS-3D, distributed by the Electromagnetic Code Consortium, was developed by Putnam et al. [7] at McDonnell Douglas Aerospace. It has full material modeling capability and uses the roof top basis functions.

The CARLOS-3D parallelization was started by converting the entire code to double precision arithmetic. Next the matrix generation was placed in the form required by subroutine MATRIX in BESLIB. This was fairly straightforward, since CARLOS-3D was already set up to do block fills of the matrix.

The parallelization of the computation of the incident vectors and the RCS was slightly more tricky. CARLOS-3D was originally set up to compute one right hand side (RHS), then solve for the corresponding solution, compute the corresponding scattering direction and the RCS, then compute another RHS, etc. BESLIB requires all the RHS to be computed before the solve subroutine in ProSolver-DES is called. ProSolver-DES also computes the solution of all the RHS simultaneously. We rearranged the computation in the original CARLOS-3D to generate the RHS and the scattering vectors in a block format as required by the RHS-generation subroutine and MULTSCA.

Figure 1: Timing results for the matrix generation of a range of isotropic and anisotropic treatments (plot title: Matrix Generation Times by Material Treatment; curves: Anisotropic, Isotropic).
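The reorganization described above, in which every RHS is generated first and all of them are then solved against a single factorization, can be illustrated with a small dense example. The pure-Python sketch below is not the ProSolver-DES interface. The names `lu_factor`, `lu_solve` and `solve_all` are hypothetical, and the in-core LU with partial pivoting merely stands in for the out-of-core solver.

```python
def lu_factor(a):
    """LU-factor a square matrix with partial pivoting; returns (lu, perm)."""
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))  # perm[i] = original row now stored in row i
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(lu[r][k]))
        if lu[p][k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for r in range(k + 1, n):
            lu[r][k] /= lu[k][k]
            for c in range(k + 1, n):
                lu[r][c] -= lu[r][k] * lu[k][c]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b using the factors from lu_factor."""
    n = len(lu)
    x = [b[perm[i]] for i in range(n)]
    for i in range(n):              # forward substitution (unit lower factor)
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):    # back substitution (upper factor)
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def solve_all(matrix, rhs_block):
    """Factor once, then solve for every right-hand side in the block."""
    lu, perm = lu_factor(matrix)
    return [lu_solve(lu, perm, b) for b in rhs_block]


if __name__ == "__main__":
    z = [[2.0, 1.0], [1.0, 3.0]]
    print(solve_all(z, [[3.0, 5.0], [1.0, 0.0]]))
```

The expensive factorization is performed once regardless of how many excitation vectors there are, which is why generating the full RHS block up front pays off for monostatic sweeps with hundreds of angles.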

The performance of CARLOS-3D is slightly worse than that of FERM and IBC3D because the node section size used is 260 × 260 instead of the optimal node section size used in FERM and IBC3D, which is 330 × 330.

Table 3 below shows a few timing results for various job sizes and cube sizes.


No. of Nodes   No. of Unknowns   Fill Time (hrs)   Factor Time (hrs)   Total Time
4              2,832             0.99              0.28                1.36
16             5,712             0.11              0.02                0.13
64             16,200            6.67              3.43                11.21

Table 3: Timing Results for CARLOS-3D


3.4  Parallelizing  SIG2D 

SIG2D was originally developed by the Science Applications International Corporation (SAIC) [10]. It is a 2D moment method code using roof top basis functions. We started by converting the code to double precision arithmetic. Our version of the ProSolver-DES does not handle symmetric matrices. Therefore we excluded the symmetry consideration in the original code. In the impedance matrix generation of the original code, the inner-most loop generates the elements that are not in consecutive storage. We restructured the loop structure to avoid unnecessary computation. We also scaled the rows of the impedance matrix with one of the BESLIB subroutines. Table 4 below shows the timing results of a few test cases.

No. of Unknowns   Fill Time (hrs)   Factor Time (hrs)   Total Time
?                 0.038             0.064               0.127
?,?92             0.?80             0.803               1.620
10,???            1.92?             2.33?               ?.867
12,342            2.?31             3.434               6.375
13,934            2.928             4.716               8.3?2

Table 4: Timing Results for SIG2D


4 Conclusion and Future Improvement

We have demonstrated that a toolbox for parallelization of moment method codes is useful and conserves time in the parallelization of MOM codes. However, BESLIB still needs improvement. The following tasks would enhance the usefulness and efficiency of BESLIB.

1. Simplify and streamline the programmer interface.

2. Make the code even more modular.

3. Add a general out-of-core matrix multiply routine.

4. Add flags to turn scaling and RCM ordering on or off.

We also want to implement another load balancing scheme for the matrix generation. In this scheme, each processor looks at a token to determine the next node section to be created, and the processor can create any node section, not just the ones it will be associated with during the solve.

Abstract

This paper presents the application of the Finite-Difference Time-Domain (FDTD) method to microwave circuits. The FDTD method is extended to model the interaction of three-terminal active devices and electromagnetic fields by replacing the active device with equivalent current sources. The implementation of the current sources is investigated to show the effect on the results. A typical microwave amplifier is also simulated and the results are compared with measured data with good agreement.


1  Introduction 

With  the  advent  of  powerful  computing  machines,  the  Finite-Difference  Time-Domain  (FDTD) 
method  has  been  a  popular  tool  to  analyze  electromagnetic  phenomena.  The  FDTD  method 
provides  accurate  full  wave  three-dimensional  simulation  of  Maxwell’s  equations.  Moreover, 
all  the  transient  behavior  of  wave  propagation  can  be  observed  during  the  simulation. 

Reports have shown the FDTD method being applied to the analysis of microwave circuits. The frequency-dependent characteristics of microstrip circuits have been studied, and the scattering parameters calculated from the observed time response [1, 2]. Recently, the method has been extended to include lumped devices and active devices such that microwave active circuits are simulated with full wave analysis [3]. In [4], a Gunn diode in an active antenna is modeled as an equivalent active region and incorporated into the FDTD algorithm. In [5], the FDTD method is combined with SPICE to analyze an amplifier. The active device is the core of the active circuit. The interaction between the active device and the electromagnetic field has an important effect on the characteristics of the entire circuit. In this paper, the FDTD method is extended to model the interaction of three-terminal active devices and electromagnetic fields. It is also applied to analyze a microwave amplifier.

2 Equivalent Current Sources of the Active Device


Fig. 1 shows the layout of the typical amplifier analyzed in this paper. The entire system contains distributed passive structures, lumped passive devices, and a GaAs MESFET. The distributed passive structures and the lumped passive devices can be simulated with the FDTD method as in [1, 2, 3]. What remains is to model the interaction between the three-terminal active device and the electromagnetic fields.

The small-signal behavior of an active device is fully represented by its S parameters, or by
its lumped equivalent circuit. If the voltage-current relationships at the input ports of the
device  agree  with  the  S  parameters,  the  interaction  of  the  active  device  and  electromagnetic 
fields  is  accounted  for.  Thus,  the  active  device  can  be  substituted  with  equivalent  current 
sources  if  the  voltage-current  relationships  are  maintained.  Actually,  these  current  sources 
stand  for  the  current  flowing  into  the  active  device.  Because  of  the  physical  current  flow 
direction,  these  current  sources  are  placed  in  the  longitudinal  direction.  Consequently,  the 
placement  of  the  equivalent  current  sources  is  illustrated  in  Fig.  2.  One  end  of  the  sources 
is  connected  with  the  microstrips  at  the  device  gate  and  the  drain  ports.  The  other  end  is 
connected  to  the  ground  plane  using  vias,  which  provide  a  voltage  reference  as  well  as  the 
modeling  of  the  physical  vias  from  source  pads  to  the  ground.  Since  the  device  region  of  this 
packaged  MESFET  covers  many  FDTD  cells,  all  the  FDTD  cells  at  the  gate  and  the  drain 
ports  are  placed  with  the  sources.  Thus,  the  signal  gain  and  the  signal  loss  due  to  the  active 
device  are  described  by  the  equivalent  current  sources. 


3  Extended  FDTD  method 


The implementation of the extended FDTD method relies on the integral form of Maxwell's equations. As time evolves, the H field at half-integer time steps is determined by Faraday's law, exactly as in conventional FDTD. Then Ampere's law is applied to determine the E field at integer time steps, as in [5],

$$\nabla \times \mathbf{H} = \epsilon \frac{\partial \mathbf{E}}{\partial t} + \mathbf{J}_{Device} \tag{1}$$

where $\mathbf{J}_{Device}$ represents the equivalent current source of the active device. This term is determined from the device lumped equivalent circuit, so one extra step is necessary to evaluate the E field in the cells of the active device region.

Integrating (1) over the FDTD cell gives the integral form,

$$I_{Total} = C \frac{dV}{dt} + I_{Device} \tag{2}$$

where $I_{Total}$ is the total current flowing through the FDTD cell, $I_{Device}$ is the current contributed by the active device, V is the voltage drop across the FDTD cell, and C represents the space equivalent capacitance of the FDTD cell. If the active device occupies many FDTD cells, the integration should include all the FDTD cells at the gate and the drain ports, and the terms $I_{Total}$ and C should include the contribution from all the cells. The voltage in (2) is calculated with its equivalent circuit, as shown in Fig. 3. The equivalent lumped circuit of the active device is coupled with the circuit model of the FDTD cells in the active region. Applying circuit theory, the state variable method is used to solve for the voltage V at integer time steps, and the value is then fed back into the FDTD algorithm.
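To make the role of the cell capacitance in the integral form concrete, the following Python sketch advances the cell voltage for the simplest possible device, a single conductance G, so that the device current is G times V. This is an assumption made purely for illustration: the paper couples the cell to the full MESFET equivalent circuit and solves it with the state variable method, which is not reproduced here. The function names and the numerical values are hypothetical.

```python
def step_cell_voltage(v_old, i_total, c_cell, g_dev, dt):
    """One trapezoidal step of I_total = C dV/dt + G V for the cell voltage."""
    a = c_cell / dt
    return ((a - 0.5 * g_dev) * v_old + i_total) / (a + 0.5 * g_dev)


def simulate(i_total, c_cell, g_dev, dt, n_steps):
    """Drive the cell with a constant total current, starting from V = 0."""
    v = 0.0
    history = []
    for _ in range(n_steps):
        v = step_cell_voltage(v, i_total, c_cell, g_dev, dt)
        history.append(v)
    return history


if __name__ == "__main__":
    # Illustrative values only: 1 mA drive, 1 fF cell, 20 mS device, 10 fs step.
    trace = simulate(1e-3, 1e-15, 0.02, 1e-14, 200)
    print(trace[0], trace[-1])
```

With a constant drive current the voltage relaxes to I/G, as expected once the displacement current through the cell capacitance has died out. In the actual scheme the total current would come from the discretized curl of H at each time step rather than being held constant.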


4  Results 


The active device used in the amplifier is a packaged GaAs MESFET, NEC72084, which is biased at V_D = 3 V and I_DS = 30 mA. The amplifier is designed to have 9 dB gain at 6 GHz. The equivalent lumped circuit of the MESFET is depicted in Fig. 4. The element values are optimized to match the measured device S parameters. The simulations are performed with uniform grids. For the radial stubs in the DC bias circuit, the staircase approximation is used.

Fig. 5 shows the calculated S parameters and how they vary as the number of equivalent current sources is reduced. Since the equivalent current sources represent the signal reflection and transmission characteristics of the active device, they should be distributed over all the cells across the microstrip line to avoid additional current mismatch between the microstrip line and the equivalent sources at the connection. In Fig. 5, the total equivalent current remains the same in each case, yet the frequency of the matching point shifts from 5.7 GHz to 5.19 GHz as the number of sources decreases. This shift is due to the current mismatch.

The simulation results are compared with measured data in Fig. 6. The gain at 6 GHz and the matching point of the FDTD simulation are 9.3 dB and 5.7 GHz, respectively, and those of the measured results are 9.23 dB and 5.6 GHz. The simulations are in good agreement with the measured data. Moreover, the out-of-band dips near 1 GHz and 11 GHz in the measured curves are also predicted by the FDTD simulations.


5  Conclusion 

The FDTD method is extended to provide a full vector electromagnetic analysis of wave interaction with microwave circuits. It is critical that the equivalent current sources be implemented to agree with the physical current flow. This equivalent current source approach is used to analyze a typical microwave amplifier. The results show good agreement with measured data. In general, this approach could also be applied to nonlinear circuits, and it provides accurate electromagnetic field simulations of microwave and millimeter wave circuits.

Figure 1: The layout of the microwave amplifier.


Figure  2:  The  equivalent  current  sources  substituting  for  the  active  device  are  placed  at  the 
gate  and  drain  ports,  with  one  end  connected  to  the  microstrip  line  and  the  other  end  to  the 
ground  plane  using  vias. 


Figure  3:  The  equivalent  lumped  circuit  of  the  integral  form  of  Ampere’s  law. 




Figure 4: The small signal equivalent circuit of the packaged GaAs MESFET, NEC72084.

Figure 5: The calculated S parameters of the FDTD simulations. The figure shows the trends as the number of equivalent current sources decreases while the total current stays the same. The numbers of sources at the gate and the drain ports are shown in the legend.

Figure 6: The results of the FDTD simulations and the measurement. The simulations are in good agreement with the measurement.

I. INTRODUCTION

The  Finite-Difference  Time-Domain  (FDTD)  method  has  been  used  with  great  success  to 
model  various  electromagnetics  configurations,  ranging  from  electromagnetic  scattering  (RCS) 
to  driven  antenna  problems.  The  method  has  been  extended  over  the  years  to  allow  frequency 
dispersive  materials,  surface  impedance  boundary  conditions,  sub-cell  models  of  electrically  thin 
material, etc. The quality of the absorbing boundary conditions used to truncate the computational
domain has also continued to improve.

In  this  presentation,  we  will  discuss  the  adaptation  of  a  large  part  of  the  FDTD  body  of 
knowledge  to  the  problem  of  acoustics.  At  a  basic  level,  there  are  a  number  of  parallels  between 
Maxwell’s  equations  and  the  coupled  acoustic  equations  derived  from  the  equations  of  continuity 
and  momentum  (assuming  negligible  viscosity  and  small  perturbations  from  rest).  It  is  natural  to 
adapt  the  FDTD  models  which  have  found  such  great  success  in  EM  to  the  modeling  of  a  variety 
of  acoustics  problems.  This  paper  will  present  the  many  issues  involved  in  the  development  of  such 
an  acoustic  model  including:  discretization  of  the  pressure  and  velocity  equation  (using  Yee-like 
cells),  derivation  of  the  grid  dispersion  and  stability  relations,  adaptation  of  absorbing  boundary 
conditions,  modeling  of  lossy  walls,  excitation  of  the  model,  etc.  The  well-known  acoustics  problem 
of  the  open-ended  pipe  was  used  to  validate  the  model.  The  comparison  of  the  numerical  results 
and  the  analytical  solution  was  excellent.  As  time  permits,  the  application  of  the  developed 
methodology  to  a  variety  of  acoustics  problems  will  also  be  presented. 

This adaptation of a popular EM modeling technique to another field, acoustics, is a prime example of the new multi-disciplinary nature of scientific research. This presentation will demonstrate how EM researchers can extend their techniques to a large variety of problems.

II.  DISCRETIZATION  OF  ACOUSTIC  EQUATIONS 

If one assumes small perturbations from rest and negligible viscosity, then the acoustic equations reduce to a form that is very similar to Maxwell's equations,¹

Acoustic equations:

$$\frac{\partial \mathbf{u}}{\partial t} = -\frac{1}{\delta_0} \nabla p \tag{1}$$

$$\frac{\partial p}{\partial t} = -\delta_0 c^2 \, \nabla \cdot \mathbf{u} \tag{2}$$

Maxwell's equations:

$$\frac{\partial \mathbf{H}}{\partial t} = -\frac{1}{\mu} \nabla \times \mathbf{E}, \qquad \frac{\partial \mathbf{E}}{\partial t} = \frac{1}{\epsilon} \nabla \times \mathbf{H}$$

¹ The assumption of small perturbations from rest and negligible viscosity is in general valid for sound levels below 110 dB [1]. For reference, speech is from +30 to +80 dB, 120 dB is discomfort, and 140 dB is the pain threshold [2].

Figure 1: (a) Yee cell for electromagnetics, (b) Yee-like cell for acoustics. Both cells are displayed for the rectangular coordinates case.

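where $\mathbf{u} = (u_x, u_y, u_z)$ is the gas particle velocity, comprised of velocity components in each coordinate direction, p is the deviation from ambient pressure, c is the speed of sound in the medium, and $\delta_0$ is the density of the gas at rest [3]. In rectangular coordinates, these coupled acoustic equations can be expanded to yield

$$\delta_0 \frac{\partial u_x}{\partial t} = -\frac{\partial p}{\partial x}, \quad \delta_0 \frac{\partial u_y}{\partial t} = -\frac{\partial p}{\partial y}, \quad \delta_0 \frac{\partial u_z}{\partial t} = -\frac{\partial p}{\partial z}, \quad \frac{\partial p}{\partial t} = -\delta_0 c^2 \left[ \frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y} + \frac{\partial u_z}{\partial z} \right] \tag{3}$$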

In FDTD modeling of Maxwell's equations, the solution space is discretized using the Yee cell
(see Figure 1(a)). The vector components of the electric and magnetic fields are distributed around
the unit cell so that the differential operators can be approximated by centered finite differences,
which yields a second-order accurate formulation. In a similar fashion, we distribute the scalar
pressure and the three vector components of the velocity around an acoustic Yee-like unit cell (see
Figure 1(b)). In rectangular coordinates, the discretized update equations thus become

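$$u_x^{i+.5,j,k,n+.5} = u_x^{i+.5,j,k,n-.5} - \frac{\Delta t}{\delta_0 \Delta x}\left[p^{i+1,j,k,n} - p^{i,j,k,n}\right] \qquad (4)$$

$$u_y^{i,j+.5,k,n+.5} = u_y^{i,j+.5,k,n-.5} - \frac{\Delta t}{\delta_0 \Delta y}\left[p^{i,j+1,k,n} - p^{i,j,k,n}\right] \qquad (5)$$

$$u_z^{i,j,k+.5,n+.5} = u_z^{i,j,k+.5,n-.5} - \frac{\Delta t}{\delta_0 \Delta z}\left[p^{i,j,k+1,n} - p^{i,j,k,n}\right] \qquad (6)$$

$$p^{i,j,k,n+1} = p^{i,j,k,n} - \delta_0 c^2 \Delta t\left[\frac{u_x^{i+.5,j,k,n+.5} - u_x^{i-.5,j,k,n+.5}}{\Delta x} + \frac{u_y^{i,j+.5,k,n+.5} - u_y^{i,j-.5,k,n+.5}}{\Delta y} + \frac{u_z^{i,j,k+.5,n+.5} - u_z^{i,j,k-.5,n+.5}}{\Delta z}\right] \qquad (7)$$

where the notation $p(x,y,z,t) = p(i\Delta x, j\Delta y, k\Delta z, n\Delta t) = p^{i,j,k,n}$ is utilized.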

These equations are updated in time using a leap-frog scheme. First, the u's at time level
n + .5 are computed from the p's at time level n and the previous u's at time level n - .5. Then, the
p's at time level n + 1 are computed from the u's at time level n + .5 and the previous p's at time
level n. This process repeats until the temporal simulation is complete.
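
The leap-frog ordering described above can be sketched in one dimension. This is a minimal Python/NumPy illustration and not the authors' code: the function name, the grid size, the air constants, the Gaussian initial pressure, and the rigid-wall terminations are all assumptions made for the example.

```python
import numpy as np

def leapfrog_1d(n_cells=200, n_steps=150, courant=0.5):
    """Advance a 1D acoustic FDTD grid: p at cell centers, u at cell faces."""
    c, rho0, dx = 343.0, 1.2, 1.0e-2          # assumed: air, 1 cm cells
    dt = courant * dx / c
    x = (np.arange(n_cells) + 0.5) * dx       # pressure node positions
    p = np.exp(-((x - x.mean()) / (8 * dx)) ** 2)   # p at time level n = 0
    u = np.zeros(n_cells + 1)                        # u at time level n = -1/2
    for _ in range(n_steps):
        # u at n + 1/2 from p at n and u at n - 1/2 (interior faces only;
        # the two end faces stay at u = 0, which models rigid walls)
        u[1:-1] -= dt / (rho0 * dx) * (p[1:] - p[:-1])
        # p at n + 1 from u at n + 1/2 and p at n
        p -= rho0 * c**2 * dt / dx * (u[1:] - u[:-1])
    return x, p
```

With these assumed settings the initial pulse splits into two counter-propagating pulses of roughly half the original amplitude, and the summed pressure is conserved because the end faces carry no flow.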


II.1  GRID  DISPERSION

With any discretized solution to wave-type equations, one should be concerned about grid
dispersion: waves in the numerical grid travel at speeds slightly different from those in real
space. Furthermore, this grid dispersion tends to depend on frequency and on the angle of
propagation. Therefore, it is difficult to calibrate for, and the best strategy is to try
to minimize the amount of grid dispersion.

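To solve for the dispersion relation, we assume a general plane wave is propagating throughout
the grid, i.e.,

$$p,\ u_x,\ u_y,\ u_z \;\propto\; e^{\,j(\omega n\Delta t - k_x i\Delta x - k_y j\Delta y - k_z k\Delta z)} \qquad (8)$$

where $k_x$, $k_y$, and $k_z$ are the x, y, and z components of the numerical grid wave vector k. Using (8)
to eliminate half of the field components in (4)-(7) yields

Then, substituting (9)-(11) into (12) yields after some algebraic manipulation

$$\left[\sin^2\!\left(\frac{\omega\Delta t}{2}\right) - \left(\frac{c\Delta t}{\Delta x}\right)^2\sin^2\!\left(\frac{k_x\Delta x}{2}\right) - \left(\frac{c\Delta t}{\Delta y}\right)^2\sin^2\!\left(\frac{k_y\Delta y}{2}\right) - \left(\frac{c\Delta t}{\Delta z}\right)^2\sin^2\!\left(\frac{k_z\Delta z}{2}\right)\right] p^{i,j,k,n} = 0.$$

In general $p^{i,j,k,n}$ is not zero and thus

$$\sin^2\!\left(\frac{\omega\Delta t}{2}\right) = \left(\frac{c\Delta t}{\Delta x}\right)^2\sin^2\!\left(\frac{k_x\Delta x}{2}\right) + \left(\frac{c\Delta t}{\Delta y}\right)^2\sin^2\!\left(\frac{k_y\Delta y}{2}\right) + \left(\frac{c\Delta t}{\Delta z}\right)^2\sin^2\!\left(\frac{k_z\Delta z}{2}\right).$$

This is the dispersion relation for the discretized solution; it is identical to the dispersion relation
for the FDTD technique applied to Maxwell's equations. For a uniform grid ($\Delta s = \Delta x = \Delta y = \Delta z$),
it is commonly re-written in the form

$$\left(\frac{\Delta s}{c\Delta t}\right)^2\sin^2\!\left(\frac{\pi c\Delta t}{\lambda}\right) = \sin^2\!\left(\frac{\pi\Delta s\cos\alpha}{\lambda_g}\right) + \sin^2\!\left(\frac{\pi\Delta s\cos\beta}{\lambda_g}\right) + \sin^2\!\left(\frac{\pi\Delta s\cos\gamma}{\lambda_g}\right)$$

where $\lambda$ is the wavelength in real space, $\lambda_g$ is the numerical wavelength, and $\cos\alpha$, $\cos\beta$, $\cos\gamma$
are the direction cosines of the propagating plane wave. Then the phase error per cell is

$$2\pi\,\Delta s\left(\frac{1}{\lambda_g} - \frac{1}{\lambda}\right).$$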
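
The uniform-grid dispersion relation can be solved numerically for the grid wavenumber to see how much slower a wave travels on the grid than in real space. The sketch below is an illustration only: the function name, the normalization (cell size and sound speed set to 1), and the use of bisection are assumptions, not part of the paper.

```python
import math

def phase_velocity_ratio(cells_per_wavelength, courant, direction=(1.0, 0.0, 0.0)):
    """Return (numerical phase velocity) / c on a uniform grid with ds = 1, c = 1."""
    norm = math.sqrt(sum(d * d for d in direction))
    cosines = [d / norm for d in direction]          # direction cosines
    k_exact = 2.0 * math.pi / cells_per_wavelength   # real-space wavenumber
    # left-hand side of the dispersion relation, divided by the Courant number squared
    lhs = (math.sin(math.pi * courant / cells_per_wavelength) / courant) ** 2

    def mismatch(k):
        return sum(math.sin(0.5 * k * ci) ** 2 for ci in cosines) - lhs

    lo, hi = 0.5 * k_exact, 1.5 * k_exact            # bracket the numerical wavenumber
    for _ in range(80):                               # plain bisection
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return k_exact / (0.5 * (lo + hi))
```

For example, `phase_velocity_ratio(10, 0.5)` gives the on-axis velocity ratio at ten cells per wavelength, and passing `direction=(1, 1, 1)` with `courant=1/math.sqrt(3)` shows the dispersion-free diagonal case.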


II.2  STABILITY 

Another important criterion is the stability of the algorithm. The stability relation is derived
in a similar fashion. Again, we assume a general plane wave is propagating throughout the grid,
i.e.,

$$p,\ u_x,\ u_y,\ u_z \;\propto\; \xi^{\,n}\, e^{-j(k_x i\Delta x + k_y j\Delta y + k_z k\Delta z)}$$

where $\xi$ is the growth factor per time step; we have not constrained ourselves only to
time-harmonic plane waves. Thus, we have

Again, substituting (18)-(20) into (21) yields after some manipulation

$$\xi^2 - 2A\xi + 1 = 0$$

where

$$A = 1 - 2(c\Delta t)^2\left[\frac{\sin^2(k_x\Delta x/2)}{\Delta x^2} + \frac{\sin^2(k_y\Delta y/2)}{\Delta y^2} + \frac{\sin^2(k_z\Delta z/2)}{\Delta z^2}\right]$$
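
A quadratic of the form ξ² - 2Aξ + 1 = 0 has roots of unit magnitude only when |A| ≤ 1. Requiring this for every grid wavenumber leads to the familiar Courant limit of FDTD, which is stated here as the standard result and is an assumption about what the derivation concludes. The short sketch below checks the growth factor at the worst-case wavenumber; the function names are hypothetical.

```python
import cmath
import math

def courant_limit(c, dx, dy, dz):
    """Largest stable time step for the 3D leap-frog scheme (standard FDTD bound)."""
    return 1.0 / (c * math.sqrt(1.0 / dx**2 + 1.0 / dy**2 + 1.0 / dz**2))

def max_growth(c, dt, dx, dy, dz):
    """Largest |xi| when every sin^2 term equals 1 (worst-case wavenumber)."""
    s = (c * dt) ** 2 * (1.0 / dx**2 + 1.0 / dy**2 + 1.0 / dz**2)
    a = 1.0 - 2.0 * s
    disc = cmath.sqrt(a * a - 1.0)      # imaginary while |a| <= 1
    return max(abs(a + disc), abs(a - disc))
```

A time step just below the limit returns a growth factor of 1 (neutral stability), while a step just above it returns a value greater than 1, so errors grow every step.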

III.  ADAPTATION  OF  PML  FOR  ACOUSTICS

The most important recent advance in FDTD modeling is the introduction of the perfectly
matched layer (PML) absorbing boundary condition (ABC) [4]. Basically, the PML ABC is a region of
fictitious lossy material that surrounds the computational space. This fictitious material has the
desired property that waves at any angle and any frequency are not reflected upon entering the
lossy region. The loss is then adjusted to sufficiently attenuate the energy that has entered the PML
region. This material is called fictitious because fields within the PML are artificially split
and obey a special set of equations that cannot be cast in the form of the usual equations with
some effective material properties.

The special equations, cast for use in acoustics, are

where the scalar pressure p has been broken into three fictitious components $p_x$, $p_y$, $p_z$, and loss
terms $\sigma_x$, $\sigma_y$, $\sigma_z$ and $\sigma_x^*$, $\sigma_y^*$, $\sigma_z^*$ have been introduced.

For example, if this material is placed at the positive x end of the solution space, then set
$\sigma_x = \sigma_x^*$, $\sigma_y = \sigma_y^* = 0$, $\sigma_z = \sigma_z^* = 0$. The wave will not be reflected upon encountering the PML
region, and the wave will be attenuated as it propagates in the PML region. A similar treatment is
applied to the other faces of the solution space. The edges and corners, however, require some
special attention, and the reader is referred to [4] for the details.

Figure 2: (a) Comparison of Ez(37,37) for PML and Mur 1st order type ABC, (b) Comparison of
total energy error in the solution space.

In order to verify the accuracy of the PML ABC, consider a simple 2D problem in which the
computational space is defined to be 50 by 50 cells, with the pressure in the middle set equal to
a Gaussian pulse. Figure 2(a) compares the exact results (solid line) computed using a 500 by
500 box, the PML-surrounded box results (solid dots), and the simple Mur 1st order type ABC
(dashed line). Clearly, the PML-surrounded box solution and the exact solution are in good
agreement, while the simple ABC is in error. The total energy error in the computational grid is
compared in Figure 2(b). The PML results (solid line) are nearly 10 orders of magnitude better
than the simple Mur 1st order type ABC (dashed line).
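
The idea of an absorbing layer can be tried in a one-dimensional analogue, where a lossy layer with the same loss rate in the velocity and pressure equations is impedance matched to the interior. The sketch below is not the split-field formulation of the paper: the 1D reduction, the quadratic loss profile, the target round-trip reflection of 1e-6, the semi-implicit loss discretization, and all names are assumptions chosen for illustration. It measures the pressure left in the interior once the pulse has had time to reflect back from the outer walls.

```python
import numpy as np

def residual_after_exit(absorb, n_cells=200, n_layer=20, n_steps=400, courant=0.5):
    """Peak |p| in the interior after the pulse has had time to return from the walls."""
    c, rho0, dx = 343.0, 1.2, 1.0e-2                 # assumed: air, 1 cm cells
    dt = courant * dx / c
    xp = (np.arange(n_cells) + 0.5) * dx             # pressure nodes (cell centers)
    xu = np.arange(n_cells + 1) * dx                 # velocity nodes (cell faces)
    depth = n_layer * dx

    def sigma(x):
        # Quadratic loss profile, zero in the interior (assumed grading).
        d = np.maximum(depth - x, x - (n_cells * dx - depth))
        d = np.clip(d, 0.0, depth)
        s_max = -3.0 * c * np.log(1.0e-6) / (2.0 * depth)
        return s_max * (d / depth) ** 2 if absorb else np.zeros_like(x)

    sp, su = sigma(xp), sigma(xu[1:-1])
    # Semi-implicit treatment of the loss terms in both update equations.
    ap = (1 - 0.5 * sp * dt) / (1 + 0.5 * sp * dt)
    bp = (rho0 * c**2 * dt / dx) / (1 + 0.5 * sp * dt)
    au = (1 - 0.5 * su * dt) / (1 + 0.5 * su * dt)
    bu = (dt / (rho0 * dx)) / (1 + 0.5 * su * dt)

    p = np.exp(-((xp - xp.mean()) / (8 * dx)) ** 2)  # Gaussian pressure pulse
    u = np.zeros(n_cells + 1)                        # end faces stay rigid (u = 0)
    for _ in range(n_steps):
        u[1:-1] = au * u[1:-1] - bu * (p[1:] - p[:-1])
        p = ap * p - bp * (u[1:] - u[:-1])
    return np.max(np.abs(p[n_layer:n_cells - n_layer]))
```

Comparing `residual_after_exit(True)` with `residual_after_exit(False)` contrasts the absorbing layer with bare rigid walls on the same grid.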

IV.  VERIFICATION 

The traditional method of verifying the accuracy of a numerical model is to select a problem
with a well-known theoretical solution and to compare the results of the numerical model
to the theoretically expected results. For this paper, a three-dimensional unflanged, open-ended
circular cylinder was modelled using the previously described acoustics formulation of FDTD. The
model was excited with a Gaussian pulse. The reflection coefficient was calculated from the
numerical solution of the propagation of the Gaussian pulse in the cylinder over time and compared
to the theoretically derived reflection coefficient [5]. The comparison of the numerically
calculated and the theoretically expected values of |R| and l/a versus ka (where k is the wave
number, 2π/λ, and a is the radius of the cylinder) is shown in Figure 3. As can be seen, the agreement
between the FDTD and the theoretical results is excellent. The accuracy of this finite-difference
time-domain model is therefore verified.

Figure 3: Comparison of |R| and l/a, finite-difference time-domain solution and theoretical
solution.
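
One common way to obtain a frequency-dependent reflection coefficient from a pulsed time-domain run is to divide the spectrum of the reflected pulse by the spectrum of the incident pulse. The sketch below assumes that the two pulses have already been separated into equal-length records; the function name and the spectral threshold are hypothetical, and the paper does not state how its post-processing was done.

```python
import numpy as np

def reflection_coefficient(incident, reflected, dt, threshold=1e-3):
    """Return (freqs, R) with R = FFT(reflected) / FFT(incident).

    Bins where the incident spectrum is negligible are set to NaN so that
    division by near-zero values does not produce meaningless ratios.
    """
    inc = np.fft.rfft(incident)
    ref = np.fft.rfft(reflected)
    freqs = np.fft.rfftfreq(len(incident), dt)
    valid = np.abs(inc) > threshold * np.abs(inc).max()
    r = np.full(inc.shape, np.nan, dtype=complex)
    r[valid] = ref[valid] / inc[valid]
    return freqs, r
```

The magnitude of `r` corresponds to |R|, and its phase carries the information from which an end correction such as l/a would be extracted.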

V.  APPLICATION  TO  SPEECH  MODELING 

One important application of this new acoustics FDTD modelling technique is the problem
of modelling speech production. Most current methods of modelling speech production involve
either directly analyzing the acoustic speech waveform, which assumes a simplified linear model,
or solving the acoustic wave equation in a series of cylindrical tubes, which assumes a simplified
geometry. Although these classes of models have been successful for speech coding, recognition,
synthesis, and analysis for many years, recent advances have proceeded at a very slow pace. Many
researchers believe that the simplified models of speech production have been pushed as far as they
will go, and that, in order to continue to advance the field of digital speech processing, improved
models of speech production must be developed. This FDTD model of acoustics can provide a
mechanism for accurately modeling acoustic wave propagation in the speech production process.

We  are  currently  applying  the  FDTD  model  of  acoustics  discussed  above  to  the  problem  of 
accurately  modeling  speech  production.  In  this  new  model,  acoustic  wave  propagation  is  solved 
for  directly,  in  a  geometry  that  accurately  represents  the  vocal  tract. 

We believe that the FDTD approach has several distinct advantages over both traditional
speech modeling techniques and finite element and boundary element models. First, an FDTD
solution provides full knowledge of the acoustic flow at every point in the vocal tract at every time
step. Second, the finite-difference time-domain method makes no assumption of linearity between the
input and the output (note that this is not true for frequency-domain finite-difference methods).
Third, an FDTD model allows many parameters, such as the geometry of the vocal tract and the
excitation source, to be altered easily. Finally, finite-difference solutions are highly parallelizable,
allowing for the efficient use of computing resources.

Several experiments have been performed using FDTD models of the vocal tract during vowel
production. This has been accomplished in two dimensions using published X-ray data. Recently,
MRI data of the vocal tract during vowel production have been obtained and are being used to
develop a three-dimensional model of speech production. Results from these experiments will be
presented at the conference, as time permits.


VI.  CONCLUSIONS 


This paper has presented an adaptation of a popular EM modeling technique, FDTD, to the
field of acoustics. The parallels between the EM and acoustic coupled equations were discussed,
and the discretization of a three-dimensional acoustic cell and of the coupled acoustic equations
was presented. The dispersion relation and stability criteria were derived for the acoustic FDTD
model and were shown to be parallel to the FDTD formulation for EM. The perfectly matched
layer (PML) ABC that has recently been introduced for EM was also adapted to this FDTD
acoustic model, and it was shown that the PML solution is nearly 10 orders of magnitude better than
a simple Mur 1st order type ABC. The accuracy of the acoustic FDTD model was verified by
modeling a well-known acoustics problem, the unflanged, open-ended circular cylinder, and comparing
the numerical and analytic solutions for the magnitude and phase of the reflection coefficient.
Finally, the application of this FDTD acoustic model to an important problem, modeling speech
production, was discussed briefly.


I.  Introduction

The geometrical modeling flexibility of the finite-difference time-domain (FDTD) method [1] makes it a
very powerful simulation methodology for analyzing and designing antenna structures for modern defense
and commercial related products. An example of such an arena where the FDTD technique can be used
to great advantage involves the design of antennas for transceiver handsets used in personal wireless
communications networks. New devices demand antenna structures which can be efficiently and
conveniently integrated with small, portable hand-held units. Figure 1 illustrates two such handset/antenna
configurations: a back-mounted planar inverted F antenna (PIFA) designed for cellular applications and
a circularly polarized patch antenna for use in satellite communications. The FDTD algorithm allows
these antennas to be modeled in their true operating environment with the handset chassis and plastic
casing and even the operator's biological tissue included in the simulation [2, 3].

The fact that the FDTD methodology allows modeling of the human tissue becomes particularly
important when considering the performance requirements dictated by modern systems. For example,
most hand-held transceivers are designed to maximize transmission efficiency in order to conserve power.
However, high mismatch losses induced by the antenna/tissue coupling as well as gain loss due to
absorption in the tissues can reduce the transmission efficiency. Additionally, for systems where a handset
must link to a low-earth orbit satellite, the antennas must provide circularly polarized radiation with
a reasonably isotropic pattern. Because the presence of the head can significantly alter the radiation
pattern, it is essential to include its effect in the antenna analysis.

The goal of this paper is to illustrate the use of the FDTD methodology in conjunction with detailed
models of human tissue and handset-mounted antennas in predicting the effect of the human body on the
performance of small antennas. The study will address the computational issues involved in using the
FDTD technique for this application, such as computational time and storage requirements, transient
response duration of high-permittivity, low-Q dielectrics, and considerations for dispersive dielectrics.
Simulation results will be provided for representative handset/antenna configurations designed for cellular
and satellite communication networks.

II.  FDTD  Modeling 

In performing the FDTD simulations of handset-mounted antennas, care must be exercised in modeling
the antenna, handset, and biological tissue. In determining appropriate models, it is important to consider
not only their geometrical accuracy but also the computational implications associated with their use.
In this section, the models for the antennas, handsets, and biological tissue are discussed. Additionally,
the computational storage and time requirements are presented.

Figure 1: Geometry for (a) the back-mounted PIFA on the handset, (b) the circularly polarized patch
antenna on the handset, and (c) cross-sectional view of the patch with dimensions.


A.  Antenna/Handset  Models 

The basic modeling of the antenna and handset is accomplished by enforcing appropriate boundary
conditions in the FDTD method on the conducting surfaces. Wires which may form part of the antenna
itself or exist in the antenna feeding network are modeled by using a special wire subcell method which
accounts for the finite wire diameter [1]. This practice is particularly important for obtaining accurate
input impedance and gain values for the radiators [2]. If a plastic casing is used which surrounds
the hand-held unit, it can be modeled by assigning the appropriate permittivity values to the cells
immediately adjacent to the handset conductor. In this paper, the plastic casing is a dielectric with a
relative permittivity of $\epsilon_r = 2$.

One interesting aspect involved with the FDTD simulation of antenna systems is the choice of
excitation models used. Typically, a gap-voltage model is used in which the electric field is specified at the
antenna feed point [5]. In this work, the coaxial feed model illustrated in Figure 2(a) is used [2]. With
this model, the radii of the coaxial inner and outer conductors, $r_a$ and $r_b$, are chosen to represent the
proper input impedance of the actual antenna feed line. Special difference equations are then used at the
interface between the coax and the antenna to account for the known field behavior within the coaxial
line as well as the difference between the radius and the FDTD discretization size [2]. Numerical
simulations performed by the authors have demonstrated that the two approaches typically provide the
same results. However, the advantage of the coaxial feed model is that it can reduce the amount of
computational time required to obtain the antenna transient response. This is illustrated in Figure 2(b),
which plots the normalized current versus the normalized time (normalized using the free-space impedance
and the speed of light) for an 8.5 cm long wire ($r_a$ = 0.16 mm) monopole centered on a l.J cm
X Id cm ground plane excited using the gap and coaxial feed models. A 320 ps wide Gaussian-shaped
voltage pulse has been used as the excitation for both plots. As can be seen, use of the coaxial feed
model noticeably reduces the duration of the transient response.


B.  Tissue  Models 

In order to obtain insight into the performance of antennas near a human operator, considerable care
has been taken to develop tissue models which closely resemble true human anatomy [3]. These
inhomogeneous models are formed within the FDTD framework by assigning the appropriate permittivity
and conductivity to different cells in the grid. The human hand is modeled as a 6.56 mm layer of bone
completely surrounded by a nim layer of muscle, as implied in Figure 3. The human head model, a
cross section of which appears in Figure 4(a), consists of the tissues listed in Table 1. The permittivity,
conductivity, and density of these tissues at 900 MHz and 2 GHz have been obtained from published
data [6]. Figure 4(b) illustrates the full head, hand, and handset model which is used in the FDTD
simulations.

Figure 2: (a) Cross-sectional view of the simulated coax, (b) Normalized terminal antenna current
versus time for the gap voltage and simulated coaxial source models.

One important issue relating to the time-domain modeling of high-permittivity dielectrics involves
the slow rate at which energy is dissipated from the system. In order to demonstrate this phenomenon,
consider a dipole antenna placed near a homogeneous sphere of dielectric, as shown in the inset of
Figure 5. A 420 ps Gaussian voltage pulse is introduced at the antenna terminals and the input current
is monitored as a function of time in Figure 5. The first plot, in which the dielectric sphere has a relative
permittivity of $\epsilon_r = 1$, corresponds to the dipole in free space. In the second frame, the sphere has
$\epsilon_r = 74$, corresponding to the permittivity of the humour in the eye at 915 MHz. As can be seen,
the introduction of the high-permittivity dielectric noticeably extends the transient response duration.
However, addition of the tissue losses reduces the antenna transient response duration, as shown in the
third frame of Figure 5. The final frame illustrates the response when the dipole is placed next to the full
inhomogeneous head model, where again it is seen that the losses noticeably reduce the ringing duration.

A second issue involved in the modeling of biological tissue involves the dispersive nature of the
tissue electrical parameters. An accurate way to accommodate such media in a time-domain simulation
approach is to use a methodology which takes into account the dispersive nature [7]. This is particularly
important for systems in which the wide-band response is desired. For this work, however, the bandwidth
of interest is relatively narrow. Therefore, in order to maintain computational simplicity and speed, it
will be assumed that the constitutive parameters are constant over the bandwidth.

Figure 5: Normalized terminal antenna current versus time for the dipole near three different dielectric
spheres and near the human head model.

C.  Computational  Requirements 

Depending upon the physical quantity under investigation, two different computational grids are used.
For example, for input impedance calculations, numerical convergence tests have proven that a cell size
of 3.28 mm is adequate (for antennas operating around 915 MHz). For pattern computations, a 6.56 mm
cell size appears to give converged results. The computer platform used in the simulations is an IBM
RISC/6000 530H workstation. For the model with the larger cells, JSOO time steps are typically used,
resulting in overall requirements of 2-.'^ hours of execution time and 17 MB of storage. For the model
with the smaller cells, 3000 time steps are used, requiring 10-12 hours of run time and .'iO MB of storage.
For computations at 2 GHz, the models must be scaled appropriately to account for the reduction in
wavelength.
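
The link between cell size, frequency, and permittivity can be made concrete with a small helper that reports how many cells span one wavelength and how a cell size scales when the frequency changes. This is a sketch for orientation only: the function names are hypothetical, and the proportional scaling rule is an assumption about what scaling "appropriately" means.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def cells_per_wavelength(freq_hz, cell_m, eps_r=1.0):
    """Number of grid cells per wavelength inside a lossless dielectric."""
    wavelength = C0 / (freq_hz * math.sqrt(eps_r))
    return wavelength / cell_m

def scaled_cell(cell_m, f_old_hz, f_new_hz):
    """Cell size that keeps the same cells-per-wavelength at a new frequency."""
    return cell_m * f_old_hz / f_new_hz
```

For instance, `cells_per_wavelength(915e6, 3.28e-3, eps_r=74)` gives the resolution inside the highest-permittivity tissue mentioned in the text, and `scaled_cell(3.28e-3, 915e6, 2e9)` gives the proportionally reduced cell for 2 GHz.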

III.  Representative  Results 

The computational capabilities developed based on the FDTD methodology are of great utility in
determining the effect of the operator tissue on the antenna performance. In this section, computational
examples are provided to illustrate the use of the technique in determining the performance of the two
handsets depicted in Figure 1.

A.  Back-Mounted  Planar  Inverted  F  Antenna 

As a first example, consider the back-mounted planar inverted F antenna (PIFA) shown in Figure 1(a).
This handset/antenna structure is designed for use at 915 MHz in a cellular or other similar
terrestrial-based communications system. The antenna geometry is chosen to minimize antenna size for the
desired resonant frequency. Furthermore, because the plastic case which surrounds the handset will
influence the antenna resonance, the antenna dimensions have been chosen such that the best match occurs
at 915 MHz with the casing present.

An examination of the operator's influence on the antenna impedance behavior appears in Figure 6.
In this plot, the magnitude of $S_{11}$ (assuming a 50 Ω feeding coax) is plotted as a function of frequency
for three different handset/tissue configurations. The first curve represents the behavior when no operator
is present. In the second curve, the hand is included and is placed such that its top is 4.26 cm below the
top of the handset. In the third curve, the hand maintains its position and the head is placed 6.56 mm
from the edge of the handset. As can be seen, the influence of the tissue can noticeably alter the match
between the antenna and feed line. Such effects must be fully investigated in a given design to ensure
that excessive mismatch losses will not occur as the user moves his hand along the handset.

Figure 6: Computed $|S_{11}|$ versus frequency for the back-mounted PIFA on the plastic covered handset
with no tissue present, with the top of the hand located 4.26 cm from the handset top, and with the
hand and the head located 6.56 mm from the handset front.
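
The relation between input impedance, the reflection coefficient seen by the feed line, and the resulting mismatch loss can be written in a few lines. The sketch uses the standard transmission-line definitions with the 50 Ω reference mentioned in the text; the function names are hypothetical.

```python
import math

def s11(z_in, z0=50.0):
    """Reflection coefficient of a load z_in on a line of impedance z0."""
    return (z_in - z0) / (z_in + z0)

def mismatch_loss_db(gamma):
    """Power lost to reflection, in dB, for reflection coefficient gamma."""
    return -10.0 * math.log10(1.0 - abs(gamma) ** 2)
```

As a usage example, `mismatch_loss_db(s11(100 + 0j))` evaluates the loss for a 2:1 resistive mismatch, which is about half a decibel.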

The presence of the tissue can also significantly change the antenna gain pattern. This is demonstrated
in Figure 7, which presents the gain pattern magnitude in the principal planes both with and without
the tissue. In this computation, the handset is held upright with respect to the head, although it is
possible to investigate other handset orientations [3]. As can be seen, the tissue influences the pattern
shape, polarization, and gain. One noticeable feature is that the hand and head absorb 18% of the power
delivered to the antenna.

B.  Circularly-Polarized  Patch  Antenna 

Figure 1(b) illustrates the geometry for a transceiver handset with a dual-probe fed patch antenna. The
patch construction, detailed more carefully in Figure 1(c), consists of a 3.75 mm thick conductor-clad
dielectric ($\epsilon_r = 2.51$) which "caps" the conducting handset chassis. This design allows reduction of the
antenna lateral dimensions for a given operating frequency. By exciting the two probes in phase
quadrature, this antenna can be made to provide circular polarization. While the configuration of Figure 1(b)
is to be investigated in this paper, other embodiments of this antenna are also being considered.

Figures 8 and 9 compare the radiation performance of the patch antenna when the handset is isolated
and when it is held 3.75 mm from the operator's head. Figure 8(a) illustrates the magnitude of the
radiation pattern in the principal coordinate planes at 2 GHz for this antenna when no tissue is included
in the computation. As can be seen, the magnitudes of the θ and φ polarizations are approximately equal
for θ < 60°. Figure 8(b) compares the phase difference $\arg\{E_\theta\} - \arg\{E_\phi\}$ when the tissue is absent
and present. Here, it is apparent that the phase difference is near ±90° over a broad range of angles
when no tissue is included. However, when the operator influence is accounted for in the computation,
the phase difference deviates noticeably from the desired ±90° value. This deviation will result in a
degradation of the circularly polarized radiation.

This degradation of the circular polarization is further aggravated somewhat by the influence of the
tissue on the antenna magnitude patterns. This is illustrated in Figure 9, which presents the patterns
in the two principal planes. As can be seen, the magnitudes of the two polarizations deviate noticeably
away from the θ = 0° direction, particularly in the x-z plane. It is also noteworthy that 5.1% of the
power delivered to the antenna is absorbed by the tissues. This loss, coupled with the change in the
pattern shape, results in a gain reduction at θ = 0° from 2.11 dB to 3.3 dB. Note that if desired, the
data presented here could be plotted in terms of left-hand and right-hand circularly polarized radiation
modes.
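
The conversion from the two linear far-field components to circular components, and from there to an axial ratio, can be sketched as follows. Which of the two circular components is called left-hand or right-hand depends on the time and coordinate conventions, so the labels are deliberately left generic; the function names are hypothetical and the example is not taken from the paper.

```python
import math

def circular_components(e_theta, e_phi):
    """Split complex far-field components into two orthogonal circular components."""
    e_a = (e_theta - 1j * e_phi) / math.sqrt(2)
    e_b = (e_theta + 1j * e_phi) / math.sqrt(2)
    return e_a, e_b

def axial_ratio_db(e_theta, e_phi):
    """Axial ratio in dB: 0 dB is perfect circular, infinity is linear."""
    e_a, e_b = circular_components(e_theta, e_phi)
    major = abs(e_a) + abs(e_b)
    minor = abs(abs(e_a) - abs(e_b))
    return math.inf if minor == 0.0 else 20.0 * math.log10(major / minor)
```

Equal magnitudes with a 90° phase difference give 0 dB, while the same magnitudes with only a 60° phase difference give an axial ratio of roughly 4.8 dB, which illustrates how a phase deviation degrades circular polarization.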


Figure 9: Radiation patterns for the geometry in Figure 1(a) with the head present as in Figure 4(b).


IV.  Conclusions 

This paper has demonstrated the use of the FDTD simulation methodology to investigate the performance
of handset-mounted antennas operating near a human. Computational issues such as computational time
and storage requirements, transient response duration of high-permittivity dielectrics, and considerations
for dispersive dielectrics have been discussed. Representative examples have been presented to illustrate
the influence of biological tissue on antennas for cellular and satellite communication systems.


Introduction

Ground-penetrating  radar  (GPR)  is  a  common  technique  in  the  detection  of  buried  objects.  The 
effectiveness  of  any  GPR  system  is  strongly  affected  by  the  antenna  design,  soil  parameters,  and  target 
characteristics.  In  the  past,  optimal  GPR  designs  required  extensive  measurements  and  design 
iterations.  Modeling  offers  the  promise  of  finding  quality  designs  before  the  expensive  construction  of 
prototypes. 

We  are  using  the  finite-difference  time-domain  (FDTD)  method  to  model  GPR.  Of  key  importance  is 
FDTD's  capability  of  modeling  the  fields  inside  a  penetrable  medium,  namely  the  ground.  There  are 
several  advantages  of  this  method  for  GPR  modeling.  First,  the  general  and  flexible  nature  of  the 
FDTD  method  allows  us  to  model  complex  antennas,  complex  targets,  and  possible  soil  inhomogeneities 
in  a  straightforward  manner.  Second,  the  method  can  model  the  soil,  which  is  a  dispersive  (and 
dissipative)  dielectric.  We  are  modeling  the  dispersive  soil  as  a  Debye  medium,  using  the  recursive 
FDTD  formulation  [1].  A  third  advantage  of  the  FDTD  method  is  that  we  can  find  the  broadband 
response  of  a  system  by  using  pulsed  excitation,  that  is,  find  results  over  a  broad  frequency  band  with 
a  single  FDTD  calculation.  Thus  we  can  model  pulsed  systems  directly,  as  well  as  stepped-frequency 
and  swept-frequency  systems  through  postprocessing.  Finally,  we  are  using  the  damped  Higdon’s  outer 
radiation boundary condition (ORBC) [2]. This ORBC is able to model dispersive media at the
computational  boundary,  including  the  air/soil  interface. 

In  this  paper  we  present  a  few  examples  of  using  the  FDTD  method  to  model  GPR  antennas  and 
applications.  The  method  has  previously  been  used  for  2.5D  simulations  (3D  source,  2D  geometry) 
[3].  Our  examples  are  all  3D  models,  although  we  have  also  used  the  method  for  2D  modeling. 

Bowtie  Antenna 

Figure  1  shows  a  3D  FDTD  bowtie  antenna  model.  This  is  a  top  view,  showing  the  FDTD  cell  edges 
that  are  modeled  as  perfect  electrical  conductors  (PEC).  We  are  using  two  symmetry  planes  in  this 
example,  reducing  the  FDTD  model  (in  memory  and  runtime)  by  a  factor  of  four.  The  left  face  has 
a  PEC  plane,  creating  mirror  symmetry  along  that  plane,  while  we  use  planar  symmetry  along  the  lower 
edge.  The  antenna  is  driven  along  the  single  center  element  at  the  lower  left  by  setting  the  field  at 
that  point  after  every  timestep.  This  method  of  driving  the  antenna  models  the  system  as  if  the 
transmitter  has  infinite  impedance;  we  are  modeling  only  the  antenna  here,  not  the  antenna  feed, 
although that is possible. The FDTD cell resolution is 1 cm in all directions and the bowtie is located
1 cm above the surface of the earth. The earth extends completely to the edges of the simulation
volume,  except  at  the  left  face  (where  the  PEC  plane  is  located). 


We  can  use  the  FDTD  bowtie  model  to  find  the  antenna  input  impedance  as  well  as  the  field  pattern  in 
the  soil.  The  input  impedance  is  an  important  antenna  characteristic,  indicating  the  antenna  bandwidth, 
among  other  things.  Observing  how  the  impedance  varies  with  frequency,  antenna  configuration, 
geometry,  and  soil  conditions  aids  the  designer  in  producing  an  optimum  GPR  antenna.  We  are  also 
interested  in  the  pattern  and  amplitudes  of  the  fields  in  the  soil.  Using  FDTD,  we  can  sample  the  fields 
at  any  number  of  locations  within  the  ground,  for  example,  at  a  constant  depth. 

Figure  2  shows  the  input  impedance  of  the  bowtie  antenna  suspended  in  free  space;  the  results  agree 
with  measurements  by  Brown  and  Woodward  [4].  Figure  3  shows  the  input  impedance  when  the 
antenna  is  1  cm  above  the  ground.  As  expected,  the  soil  clearly  has  a  significant  effect  on  the  antenna 
characteristics.  The  magnitude  of  the  impedance  is  dramatically  reduced  and  the  broadband 
characteristics  are  shifted  to  lower  frequencies.  For  this  calculation,  we  modeled  the  soil  as  a  double- 
Debye  medium,  with  Debye  resonances  of  7  MHz  and  3  GHz.  The  soil  has  a  complex  relative 
dielectric  constant  of  the  form  in  Equation  1  and  the  parameters  in  Table  I. 

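\[ \epsilon_r(f) = \epsilon_\infty - j\,\frac{\sigma}{2\pi f \epsilon_0} + \frac{\Delta\epsilon_1}{1 + j f/f_1} + \frac{\Delta\epsilon_2}{1 + j f/f_2} \tag{1} \]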

Table  I  -  Soil  Parameters  for  Bowtie  Model 
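
As an illustration of how a double-Debye soil model can be evaluated, the sketch below computes a complex relative permittivity with two relaxation terms and a conductivity term. The functional form is the common two-pole Debye expression with an exp(+j 2 pi f t) time convention, which is an assumption here. The function and parameter names are hypothetical, and the numbers in the usage note are placeholders, not the Table I parameters; only the 7 MHz and 3 GHz relaxation frequencies come from the text.

```python
import math

EPS0 = 8.854187817e-12  # permittivity of free space, F/m

def double_debye_permittivity(f, eps_inf, delta1, f1, delta2, f2, sigma):
    """Complex relative permittivity of a two-pole Debye medium.

    f      : frequency in Hz (must be > 0 when sigma is nonzero)
    eps_inf: high-frequency relative permittivity
    delta1, delta2: strengths of the two relaxation terms
    f1, f2 : relaxation frequencies in Hz
    sigma  : static conductivity in S/m
    """
    eps = eps_inf
    eps += delta1 / (1.0 + 1j * f / f1)
    eps += delta2 / (1.0 + 1j * f / f2)
    if sigma != 0.0:
        eps -= 1j * sigma / (2.0 * math.pi * f * EPS0)
    return eps
```

For example, `double_debye_permittivity(400e6, 4.0, 2.0, 7e6, 1.5, 3e9, 0.005)` evaluates the model at 400 MHz; the negative imaginary part represents loss in this sign convention.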


The  pattern  of  radiation,  both  in  the  air  and  in  the  ground,  is  an  important  operating  characteristic  of 
a  GPR  antenna.  Antenna  radiation  patterns  are  typically  reported  as  far-field  quantities.  But  in  GPR 
applications,  especially  those  detecting  shallow  buried  objects,  the  near-zone  fields  are  of  utmost 
importance.  Furthermore,  the  field  patterns  change  significantly  with  distance  in  the  ground,  due  to 
the  dissipative  and  dispersive  nature  of  the  soil.  Thus  the  traditional  concept  of  (far-field)  radiation 
pattern is not as meaningful in GPR antenna analysis. Instead, we are usually interested in the pattern
of  field  strengths  at  a  constant  depth.  This,  for  example,  is  meaningful  when  scanning  a  GPR  over  a 
buried  object. 

FDTD models all the fields in the computational volume. By saving these values at each timestep, we
can examine the fields during postprocessing. Rather than saving the fields everywhere, we save all the
field components at a series of locations, such as at a constant depth. After converting to the frequency
domain, we find the fields as a function of position and frequency.
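
The conversion to the frequency domain can be sketched as a discrete Fourier transform evaluated only at the frequencies of interest. This is a generic illustration, not the authors' postprocessing code; the function name and the sampling values in the usage note are assumptions.

```python
import cmath
import math

def dft_at(samples, dt, freqs):
    """Fourier-transform a sampled time series at selected frequencies.

    samples: field values saved at one location, one per timestep
    dt     : timestep in seconds
    freqs  : frequencies in Hz at which the spectrum is wanted
    Returns one complex amplitude per requested frequency.
    """
    result = []
    for f in freqs:
        acc = 0.0 + 0.0j
        for n, value in enumerate(samples):
            acc += value * cmath.exp(-2j * math.pi * f * n * dt)
        result.append(acc * dt)   # dt approximates the integral measure
    return result
```

Applying `dft_at` to the record from every saved location, for instance at three frequencies, gives the field as a function of position and frequency. Dividing by the transform of the excitation would normalize out the pulse spectrum.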

Figures 4 through 6 show the field strengths for the bowtie antenna model shown above. Each figure
shows an electric field component at a depth of 15 cm. Figure 4 shows a scan of $E_x$ along the X-axis
(of the bowtie), which is the direction of orientation and polarization of the antenna, while Figure 5
shows the $E_z$ (vertical) component. The excitation used in the model produced broadband results (to
over 800 MHz), but only three discrete frequencies have been shown in the figures. Figure 6 shows
the field for a scan along the Y-axis, perpendicular to the bowtie.

UHF  Balanced-Bridge  Detector 

Lastly,  we  present  3D  FDTD  calculations  showing  the  scattering  from  a  buried  target.  Figure  7  shows 
a  side  view  (cross  section)  of  a  UHF  balanced-bridge  detector,  used  in  buried  mine  detection  [5].  It 
consists  of  three  equally-spaced  dipoles;  the  outer  two  are  driven  out-of-phase  and  the  center  dipole  is 
the  receiver.  When  the  detector  is  placed  above  a  homogeneous  half-space  (no  target),  the  received 
signal  is  zero.  An  off-center  target  upsets  the  balance,  generating  a  signal  at  the  receiver.  As  the 
detector  is  scanned  across  a  target,  we  see  a  characteristic  double-hump  signature.  The  antenna  is 
typically  driven  at  a  single  frequency  near  400  MHz. 

We  are  modeling  the  UHF  balanced-bridge  using  a  pulsed  excitation  and  an  FDTD  cell  size  of  1  cm. 
We  use  mirror  symmetry  to  reduce  the  problem  size  by  a  factor  of  two.  The  antenna  liftoff  is  10  cm, 
the  target  is  a  10  cm  PEC  cube,  and  the  target  is  buried  5  cm  deep.  Figure  8  shows  the  received 
amplitude  for  a  scan  of  the  target,  at  two  frequencies.  It  shows  the  double-hump  characteristic  of  this 
type  of  detector. 

Conclusions 

We  have  shown  a  few  examples  of  how  the  FDTD  method  can  be  used  to  model  GPR  antennas.  While 
other  computational  methods  can  model  the  simple  antennas  shown  here,  the  FDTD  method  is  flexible 
enough  to  model  more  complex  antennas,  such  as  horns  and  cavity-backed  antennas.  The  FDTD 
method  also  accounts  automatically  for  any  antenna-antenna  interactions,  important  because  many  GPR 
systems use bi-static configurations. The inherent near-zone nature of the target and antenna
interactions and the wide bandwidth of many GPR systems make the FDTD method particularly
attractive. In addition, we have written an FDTD code that displays the fields as an animation; viewing
the  evolution  of  the  fields  often  gives  insight  into  antenna  operation,  as  well  as  visual  simulation 
diagnostics.  We  can  use  an  FDTD  model  to  generate  synthetic  data,  useful  for  antenna  optimization, 
system  characterization,  and  signal-processing  development.  In  summary,  we  find  the  FDTD  method 
to  be  a  useful  tool  for  GPR  antenna  and  system  engineering. 

Figure 1. FDTD Model of Bowtie Antenna. Top view showing the FDTD cell edges
that are set to be perfect electrical conductors.

Figure 2. Impedance of Bowtie Antenna in Free Space. Shows both the
resistance and the reactance.

Figure 3. Impedance of Bowtie Antenna over Soil. Suspended 1 cm over soil, which is
modeled as a double-Debye medium with parameters in Table I.

Figure 6. Bowtie Field Pattern at Depth of 15 cm. Shows the field along the
perpendicular axis of the antenna.

Figure 7. Side View (Cross Section) of 3D FDTD Model for UHF Balanced-Bridge Detector.

Figure 8. Simulated Response of UHF Balanced-Bridge Detector. The target is
scanned in the direction indicated in Figure 7. Axes: amplitude at receiver
(arbitrary units) versus target offset (cm).


PULSE INTERACTIONS WITH NONRESONANT AND RESONANT
MATERIALS AND STRUCTURES

Laser pulses are continuing to be utilized in a variety of advanced commercial, civilian,
and military systems. Their bandwidth and intensity have been increasing, to the point
at which the materials they interact with no longer respond in a linear fashion. The
material response is nonlinear and the properties of the materials depend on the shape
of the pulse propagating in them. Moreover, the materials have memory effects so that
trains of multiple pulses can produce effects similar to those occurring from one large pulse.
Despite the increase in complexity of the associated physical properties, these nonlinear
effects offer the potential for a variety of novel device and systems applications.

Nonlinear  optical  (NLO)  devices  are  currently  being  explored  for  their  applications 
in various systems associated with communications, remote sensing, optical computing,
etc. However, as the size of optical devices such as microcavity lasers is pushed to the
size of an optical wavelength and less, the need for more exact materials and response
models is paramount to the successful design and fabrication of those devices. Most
current simulation models are based on known macroscopic, phenomenological models
that avoid issues dealing with specific microscopic behavior of the materials in such NLO
devices. Inaccuracies in the simulation results are then exacerbated as the device sizes
shrink  to  subwavelength  sizes  and  the  response  times  of  the  excitation  signals  surpass  the 
response  times  of  the  material.  There  are  laser  sources  currently  under  development  with 
sub-micron  wavelengths  that  are  pushing  the  boundaries  of  the  sub-femtosecond  regime. 
Phenomenological  non-resonant  models  lose  their  ability  to  describe  the  physics  in  this 
parameter  regime;  hence,  they  lose  their  accuracy  there.  Quantum  mechanical  effects 
begin  to  manifest  themselves;  the  simulation  models  must  incorporate  this  behavior  to  be 
relevant. 

Until recently, the modeling of pulse propagation in and scattering from complex nonlinear
media has generally been accomplished with one-dimensional, scalar models. These
models have become quite sophisticated; they have predicted and explained many of the
nonlinear as well as linear effects in present devices and systems. Unfortunately, they
cannot be used to explain many observed phenomena, and expectations are that they are not
adequately modeling multi-dimensional nonlinear phenomena. It is felt that vector and
higher dimensional properties of Maxwell's equations that are not currently included in
existing scalar models, in addition to more detailed material and device structure models,
may significantly impact the scientific and engineering results. The associated propagation
and scattering issues have a direct impact on a variety of applications, particularly on the
design and engineering of integrated photonic components that have immediate utility to
nonlinear soliton fiber optical communications systems currently under development. It is
believed that the successful development of semi-classical simulators that combine numerical
quantum mechanical models of materials and macroscopic Maxwell's equations solvers
will significantly affect the concept and design stages associated with novel nonlinear optics
phenomena.

The problem of accurate numerical modeling of the propagation of ultrafast pulses
in nonlinear media and their use in NLO devices has been subject to increasing
interest in recent years. Since the most interesting nonlinear phenomena are transient and
superposition is not available, it is natural to try to carry out this modeling directly in the
time domain. For this reason the finite-difference time-domain (FDTD) method is receiving
intensive study for modeling linear and nonlinear optical phenomena. In contrast to
the case for frequency-domain linear analysis, a single value of permittivity $\epsilon$ is completely
inadequate to describe nonlinear time-dependent phenomena, and it is essential to model
the interaction of the electromagnetic field with the material medium.

Initial simulations of these ultrafast optical pulse interactions have been based upon
several well-known phenomenological material models. They included the linear Lorentz
dispersion model, the nonlinear Debye model, the nonlinear Raman model, and the instantaneous
nonlinear Kerr model. This approach has allowed an investigation of the usually
neglected longitudinal field component and polarization effects when optical beams self-focus
in bulk materials and of the physics underlying the design of optical beam steerers and
output couplers constructed from corrugated linear and nonlinear waveguides. Nonetheless,
while they have been adequate for the applications considered, these phenomenological
material models do not handle fully resonant interactions well. To understand the physics
underlying the small-distance scale and short-time scale interactions, particularly in the
resonance regime of the materials and the associated device structures, a first principles
approach is desirable. This in turn requires quantum-mechanical descriptions of the electronic
states available in the medium. Accurate physical models must incorporate all
propagation effects such as dispersion and nonlinearity, with the proper physical linkages
between them.

We have developed a simulator that utilizes the Maxwell-Bloch system for multi-level
atoms for our material models in Maxwell's equations. This effort is novel in that it combines
a realistic material model that is quantum mechanically based with a full-wave, vector
Maxwell's equations solver. The FDTD implementation of the Maxwell-Bloch modeling
system in one space dimension and time has been accomplished and has yielded some
interesting physical consequences. In particular, in a density matrix approach to arrive
at the Bloch equations describing a two-level atom medium we introduce the terms $\rho_1$,
$\rho_2$, and $\rho_3$, which satisfy the relationship $\rho_1^2 + \rho_2^2 + \rho_3^2 = 1$ and represent, respectively,
the dispersive or in-phase component of the polarization, the absorptive or in-quadrature
component of the polarization, and the fractional difference in the populations for the two
energy levels. The near-resonant behaviour of nonlinear systems cannot be meaningfully
discussed unless dissipative effects are taken into account. The usual method of achieving
this in simple systems is to include, in the Liouville equations for these terms, phenomenologically
obtained diagonal terms consisting of characteristic decay rates. If we take the
incident electromagnetic field to be a uniform plane wave that is propagating along the
z-axis and is polarized along the x-axis, i.e., $\mathbf{E}(\mathbf{r},t) = E_x(z,t)\,\hat{x}$ and $\mathbf{H}(\mathbf{r},t) = H_y(z,t)\,\hat{y}$, and
set the spatial orientation of the dipole to be along $\hat{x}$, the polarization takes the form $\mathbf{P} = P_x\,\hat{x}$,
where

\[ P_x(z,t) = -N_{\rm atom}\,\gamma\,\rho_1(z,t) \tag{1} \]

$N_{\rm atom}$ being the number density of atoms and $\gamma$ the dipole coupling coefficient. The
following one-dimensional Maxwell-Bloch system results from this reduction:


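Maxwell equations

\[ \partial_t H_y = -\frac{1}{\mu_0}\,\partial_z E_x \tag{2a} \]

\[ \partial_t E_x = -\frac{1}{\epsilon_0}\,\partial_z H_y - \frac{1}{\epsilon_0}\,\partial_t P_x
 = -\frac{1}{\epsilon_0}\,\partial_z H_y - \frac{N_{\rm atom}\,\gamma}{\epsilon_0 T_2}\,\rho_1 + \frac{N_{\rm atom}\,\gamma\,\omega_0}{\epsilon_0}\,\rho_2 \tag{2b} \]

Bloch equations

\[ \partial_t \rho_1 = -\frac{1}{T_2}\,\rho_1 + \omega_0\,\rho_2 \tag{3a} \]

\[ \partial_t \rho_2 = -\omega_0\,\rho_1 - \frac{1}{T_2}\,\rho_2 + 2\,\frac{\gamma}{\hbar}\,E_x\,\rho_3 \tag{3b} \]

\[ \partial_t \rho_3 = -2\,\frac{\gamma}{\hbar}\,E_x\,\rho_2 - \frac{1}{T_1}\,(\rho_3 - \rho_{30}) \tag{3c} \]
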

where $T_1$ is the excited state lifetime, $T_2$ is the dephasing time, and $\rho_{30}$ is the initial
population difference in the system. Note that the specification $\rho_{30} = -1$ ($+1$)
represents all the atoms initially being in their ground (excited) states. This system of
equations can be discretized using finite differences in several different ways. We have
characterized the performance of several of these discrete approaches and feel that we
know how to extend this semi-classical model to higher space dimensions and to more
complex media such as a three-level atom medium. Our best approach obtained to date
will be highlighted in the presentation.
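
The behaviour of the Bloch equations (3a)-(3c) can be explored in isolation with a short sketch. This is not the FDTD Maxwell-Bloch discretization described above: the field is prescribed rather than coupled back through Maxwell's equations, the time integrator is a classical fourth-order Runge-Kutta scheme, decay is switched off, and units are normalized. All function and parameter names are hypothetical; `coupling_e0` stands for the product of gamma/hbar and the field amplitude. Under these assumptions a resonant drive of duration pi/coupling_e0 acts as a pi pulse in the rotating-wave picture and moves the population difference from -1 to nearly +1.

```python
import math

def bloch_rhs(rho, e_field, omega0, coupling, t1, t2, rho30):
    """Right-hand sides of the two-level Bloch equations.
    coupling is gamma/hbar; e_field is the instantaneous electric field."""
    r1, r2, r3 = rho
    return (
        -r1 / t2 + omega0 * r2,
        -omega0 * r1 - r2 / t2 + 2.0 * coupling * e_field * r3,
        -2.0 * coupling * e_field * r2 - (r3 - rho30) / t1,
    )

def drive_two_level_atom(omega0=1.0, coupling_e0=0.01, dt=0.02):
    """Integrate the Bloch equations with a prescribed resonant field
    cos(omega0 * t), no decay, atoms initially in the ground state."""
    t1 = t2 = math.inf          # infinite lifetimes: no dissipation
    rho30 = -1.0
    n_steps = round(math.pi / coupling_e0 / dt)   # pi-pulse duration
    rho = (0.0, 0.0, -1.0)

    def rhs(state, t):
        # Field amplitude is folded into the coupling parameter.
        return bloch_rhs(state, math.cos(omega0 * t), omega0,
                         coupling_e0, t1, t2, rho30)

    def shifted(state, slope, scale):
        return tuple(x + scale * k for x, k in zip(state, slope))

    for i in range(n_steps):
        t = i * dt
        k1 = rhs(rho, t)
        k2 = rhs(shifted(rho, k1, dt / 2), t + dt / 2)
        k3 = rhs(shifted(rho, k2, dt / 2), t + dt / 2)
        k4 = rhs(shifted(rho, k3, dt), t + dt)
        rho = tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                    for x, a, b, c, d in zip(rho, k1, k2, k3, k4))
    return rho
```

Because the carrier is retained, the result differs slightly from the rotating-wave prediction; with decay switched off, the sum of the squares of the three components stays at 1 up to integration error.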

Using this FDTD approach to solving the semiclassical Maxwell-Bloch system, we have
studied in a more exact manner (without removing the carrier wave) self-induced transparency
effects in a two-level atom medium. Standard self-induced transparency (SIT) results,
the so-called $\pi$, $2\pi$, $4\pi$, ... results, have been reproduced with this model. An SIT solution
represents the nonlinear wave propagation dynamics in which a particular pulse shape, a
carrier at the transition frequency with a hyperbolic secant envelope, having the appropriate
high intensity completely loses its energy to a two-level atom medium by stimulating
it from its ground state into its excited state, and then is completely reconstructed in a
coherent manner via stimulated emission by having the excited medium completely decay
back into its ground state. This SIT pulse thus propagates through the highly nonlinear
two-level atom medium with no change in its shape, i.e., as though the medium is transparent;
it is a soliton solution of the semiclassical Maxwell-Bloch system. The SIT effect
is normally described with a rotating wave approximation of the Maxwell-Bloch system.
As will be demonstrated, we have recovered these SIT pulse dynamics with our FDTD
Maxwell-Bloch simulator. However, we have also found novel features that appear at
points where the electric field is null and have been identified as being associated with the
maxima of the time derivative of the electric field. These features are not present in
standard approximate solutions to this problem.

These nonlinear time-derivative effects have been emphasized further by considering
a variety of ultrafast pulse cases. It has been demonstrated that, during ultrafast pulse
interactions with a two-level atom medium, a single cycle pulse can be designed that
completely inverts the two-level atom medium. A multiple ultrafast pulse train has been
given that can completely invert the medium from the ground to the excited state and
then completely reverse the process. These results confirm that the time-derivative-driven
nonlinear properties of the two-level atom medium have a significant impact on the time
evolution of this system in the limit of ultrafast pulses.

We have also used the FDTD Maxwell-Bloch simulator to recover expected small-signal
gain results for sinusoidal input signals. The designed ultrafast inversion pulse has
been combined with a sinusoidal input signal to form a pump-probe signal set. It will be
illustrated that a two-level atom medium could be inverted by the leading ultrafast pulse
to yield a gain medium for the trailing sinusoidal probe pulse.

Several  examples  of  the  SIT  and  the  time-derivative  pulse  propagation  effects  will  be 
shown  in  the  presentation.  Full  device  and  system  integration  complexities  are  presently 
being  introduced  into  our  model  by  considering  multi-dimensional  (spatial)  extensions.  We 
are  investigating  several  different  amplifier  and  microcavity  laser  configurations  with  the 
resulting multi-dimensional FDTD Maxwell-Bloch simulator. Progress to date in higher
dimensions  will  also  be  reported. 

1. Introduction

In recent years the growing interest in nonlinear electromagnetic problems, usually related to electronic devices and
systems, has resulted in several studies on nonlinear propagation phenomena [1]. These phenomena typically correspond
to mathematical models constituted by a set of nonlinear hyperbolic partial differential equations. Indeed, the main
feature of this kind of initial-boundary value problems is that perturbations propagate with finite speed. Often these
problems are solved in the "weakly" nonlinear limit [2]; in this way the difficulties related to the resolution of nonlinear
hyperbolic problems are overcome. On the contrary, in this paper the electromagnetic wave propagation in a "strongly"
nonlinear dielectric is studied in the time domain, directly solving the set of the Maxwell equations using two
complementary formulations [3] and the Galerkin method [4].

It is well known that nonlinear hyperbolic equations can generate discontinuous solutions, like shock waves, even if
the initial and the boundary conditions are regular [5]. Indeed, in contrast with diffusion processes, the propagation of
waves does not lead to an increase of smoothness of the solution for increasing times. Consequently, it is not possible, in
general, to prove the existence of smooth solutions of nonlinear wave equations for all times. Roughly speaking, the
collisions of different parts of the wave, propagating with different speeds because of the nonlinearity, give rise to the
appearance of shock waves (discontinuous solutions). Such shock waves play a crucial role in gas dynamics, where they
correspond to discontinuities of physical quantities (density, pressure, velocity). It has been shown that these
discontinuities may appear also in the propagation of an electromagnetic wave through a nonlinear dielectric slab
exhibiting the Kerr effect [6].

Because of this feature, classical numerical methods, like finite differences and Galerkin methods, may yield
unsatisfactory results. We have already shown that, under particular conditions, the Galerkin equations describing the
nonlinear propagation may exhibit bifurcation and chaotic phenomena [7]; thus, the numerical analysis has to be carried
out with care. Our aim is to investigate how well complementary formulations and Galerkin methods can model the
Maxwell equations describing the propagation in a dielectric slab exhibiting the Kerr effect [8]. Although the numerical
methods presented in this paper hold for three-dimensional problems, for the sake of simplicity their performance has
been evaluated by solving a canonical one-dimensional case. This choice has been made because the formation of sharp
discontinuities of the wave front can be seen also in this "simple" case.

In addition to the Galerkin method we have developed two schemes based on central finite differences [4] and on
the characteristic formulation [9]. The characteristic method promises, with little modification, to be able to describe the
discontinuous solutions. The results of the Galerkin method are then compared with those of the last two schemes.


2.  Problem  formulation 

Let's consider an electromagnetic plane wave and suppose that it is incident from the left on a nonlinear, isotropic,
homogeneous, nondispersive, time-invariant dielectric slab of width $z_0$ (Fig. 1), whose constitutive property links the
electric field to the electric displacement at the same time.

We suppose that the fields E and D are directed in the x direction, while the fields H and B are directed in the y
direction, so that the plane wave propagates in the z direction.

We want to study the propagation of the electromagnetic field in the slab. Maxwell equations can be written, for
$z \in [0, z_0]$ and $t > 0$, in the following form:

\[ \begin{cases} \dfrac{\partial E}{\partial z} + \dfrac{\partial B}{\partial t} = 0 \\[2mm] \dfrac{\partial H}{\partial z} + \dfrac{\partial D}{\partial t} = 0 \end{cases} \tag{1} \]


To solve the electromagnetic problem, we add to (1) the following constitutive relations, representing a dielectric slab
exhibiting the Kerr effect

1  /  s  /  -.'I  D  ()<z<z,|  (2) 

1 

homogeneous initial conditions, and the following boundary conditions

\[ \begin{cases} Z_0 H(z=0,t) + E(z=0,t) = 2E_i(-ct) \\ Z_0 H(z=z_0,t) - E(z=z_0,t) = 0 \end{cases} \tag{3} \]

where $\epsilon_0$, $\mu_0$ are respectively the permittivity and the permeability of vacuum, the two positive quantities in (2)
describe the nonlinear dielectric, $Z_0$ is the characteristic impedance of the medium surrounding the slab (vacuum), $z_0$ is
the slab width, and $E_i$ is the incident wave, which is supposed to be:

\[ E_i(-ct) = \begin{cases} E_m \sin(2\pi f t) & 0 < t < T \\ 0 & t < 0,\ t > T \end{cases} \tag{4} \]

where $E_m$, $f$ and $T$ are the amplitude, the frequency and the duration of the incident wave, respectively. The boundary
conditions (3) have been obtained by imposing the continuity of E(z,t) and H(z,t) for $z=0$ and $z=z_0$, and taking into
account the propagation phenomena in the vacuum.
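
The step from continuity at the two interfaces to the boundary conditions (3) can be made explicit. The following is a sketch under the assumption that the vacuum fields on each side are travelling waves; the reflected and transmitted waveforms $E_r$ and $E_t$ are auxiliary symbols introduced only for this derivation, and $Z_0$ and $z_0$ denote the vacuum impedance and the slab width.

```latex
\begin{align*}
z < 0:\quad   & E = E_i(z - ct) + E_r(z + ct), \qquad Z_0 H = E_i(z - ct) - E_r(z + ct), \\
z = 0:\quad   & E(0,t) + Z_0 H(0,t) = 2 E_i(-ct), \\
z > z_0:\quad & E = E_t(z - ct), \qquad Z_0 H = E_t(z - ct), \\
z = z_0:\quad & Z_0 H(z_0,t) - E(z_0,t) = 0 .
\end{align*}
```

Adding the two vacuum relations on the left eliminates the unknown reflected wave, and subtracting them on the right expresses the absence of any wave incident from beyond the slab, so both conditions involve only the fields at the slab faces.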

It is interesting to note that while in the case of an infinitely long slab we can claim that there is an instantaneous
link between E and D, now this is not possible any more. This happens because of the presence of a second
dielectric/vacuum interface, on which the propagating wave partially reflects: as a consequence, E and D in any point of
the slab and at any time depend on the whole history of E and D themselves.

It is convenient to put (1) - (4) in a dimensionless form. To do this, we write:

3. Two different potential formulations and Galerkin schemes

A physically well posed statement of the problem includes the relevant equations, the initial and the boundary
conditions. It is useful practice to enforce some of them explicitly. The solution formulation is then, essentially, a
procedure for imposing the remaining equations.

In this paper a new technique, based on complementary formulations, first used for magnetostatic problems [3], is
proposed for solving the system of equations (5) - (8) and for estimating the errors due to the numerical approximation.

Two complementary formulations of the equations (5) - (8) are considered: each of them enforces one of the Maxwell
equations. In the F-formulation the unknowns d and h are expressed in terms of the electric potential $f(\zeta,\tau)$ as
di 

d  =  — 
r')f 

h  =  — 
fh 

whereas in the A-formulation they are expressed in terms of the magnetic potential $a(\zeta,\tau)$ as

f  rTl 
'h=  — 

'<K 

da 
dr 

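In 3D problems we would introduce the electric (magnetic) vector potential F (A) such that $\nabla \times \mathbf{F} = \mathbf{D}$, $\mathbf{H} = +\partial \mathbf{F}/\partial t$
($\nabla \times \mathbf{A} = \mathbf{B}$, $\mathbf{E} = -\partial \mathbf{A}/\partial t$).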

In each formulation, the remaining Maxwell equation has to be enforced explicitly. To do this, it is rewritten in a
weak form first; then, it is discretized in space by means of the Galerkin method. The boundary conditions (7) are
imposed as natural boundary conditions. Finally, the ordinary differential equations obtained in this way are solved in
time by a fourth-order Runge-Kutta method.

As far as the error introduced by the Galerkin method is concerned, it may be estimated by solving the same problem
using both the formulations [3]. In this way, taking the difference between the solutions coming from the F-formulation
and the A-formulation, a measure of the numerical error is obtained. This means that we are making the following
assumption: the smaller the difference between the solutions of the two formulations, the closer we are to the exact
solution. This assumption allows an estimation of the global error and provides a useful indication for spatial
discretization refinements.
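
The error indicator described above can be sketched as a normalized difference between the nodal values produced by the two formulations. The normalization is not specified here, so the maximum-norm choice, the function name, and the sample data in the usage note are assumptions made for illustration.

```python
def normalized_difference(d_a, d_f):
    """Largest pointwise gap between two nodal solutions, divided by
    the peak magnitude of the first one (maximum-norm normalization)."""
    if len(d_a) != len(d_f):
        raise ValueError("both solutions must live on the same mesh")
    scale = max(abs(v) for v in d_a)
    if scale == 0.0:
        raise ValueError("reference solution is identically zero")
    return max(abs(a - f) for a, f in zip(d_a, d_f)) / scale
```

For instance, `normalized_difference([0.0, 0.5, 1.0], [0.0, 0.5, 0.98])` is about 0.02. A value that shrinks as the mesh is refined suggests, under the stated assumption, that both formulations are approaching the exact solution.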


3.1 Electric vector potential

Introducing the electric potential $f = f(\zeta,\tau)$, the second equation of system (5) is automatically satisfied, while the first
can be written in the following form:

Idh  dffafVi 


Jcll  (q) 

df 

I  —  +  h  =  (1 
U>T 

to which we must add initial and boundary conditions.

Now we give a weak form of (9); to do this, we define the scalar product of two functions $u(\zeta)$ and $v(\zeta)$ as:

\[ \langle u(\zeta), v(\zeta) \rangle = \int_0^1 u(\zeta)\, v(\zeta)\, d\zeta \]

Giving a weak form of (9) means projecting the residuals of (9) on a suitable set W of test functions $w(\zeta)$ and setting
these projections to zero:
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r)  r))' 
w, —  C  — 


\  aw  \  Vw(0£W 

[(w.£).(w,h>  =  0 

Integrating by parts and using boundary conditions (7) we get


—  (w  h)-/— ,c[— |)  +  Yw{0)—  +7w(l) —  +2w(0)Cj(t) 

dT  \  rX  \rX  )l  -0  ^ 

—  {w.l)--(w.h) 


Vw(0e  W 


Note that using this approach the boundary conditions are automatically taken into account in (10).

To solve (10) numerically, we divide the [0,1] interval in N subintervals (one-dimensional finite element mesh); then,
we approximate the unknown fields h and f as linear combinations of a finite number N of basis functions $w_i(\zeta)$:

\[ h(\zeta,\tau) \approx \sum_{i=1}^{N} h_i(\tau)\, w_i(\zeta), \qquad f(\zeta,\tau) \approx \sum_{i=1}^{N} f_i(\tau)\, w_i(\zeta) \]

Choosing as $w_i$ the usual finite element basis functions, $h_i$ and $f_i$ are exactly the nodal values of the fields h and f. Then,
we follow the Galerkin method: the basis functions $w_i$ just introduced are chosen also as the test functions on which we
project the residuals. Doing so, from the weak form (10) we get the following system of ordinary differential equations, in
which the unknowns are the vectors $\mathbf{f} = \{f_i\}$ and $\mathbf{h} = \{h_i\}$:


Lh  =  g(f)-  Rh  +  s(t) 

lLlX..-Lh 

I  dw'  I 

where  L„=<w.,w,>  i,,j=l.N  .  R=diag(7.() . O.y)  .  s(t)-(2c,(t).0 . 0)  and  g, (f)  =  (  — -c|2w' i ' 

Note that $\det(L) \neq 0$, since $w_1, \ldots, w_N$ are linearly independent. Consequently, (11) can be solved for $\mathbf{f}$ and $\mathbf{h}$.
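
The role of the matrix L can be illustrated with a small sketch. It assumes piecewise-linear "hat" basis functions on a uniform mesh of the unit interval, for which the matrix of scalar products of the basis functions is tridiagonal and can be inverted with the Thomas algorithm at each time step. The function names and the node count are hypothetical, and the sketch indexes basis functions by mesh node.

```python
def hat_mass_matrix(n_nodes, length=1.0):
    """Bands (lower, diagonal, upper) of the matrix of scalar products
    of linear hat functions on a uniform mesh with n_nodes nodes."""
    h = length / (n_nodes - 1)
    diag = [2.0 * h / 3.0] * n_nodes
    diag[0] = diag[-1] = h / 3.0      # end nodes carry half a hat
    off = [h / 6.0] * (n_nodes - 1)
    return off, diag, off[:]

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (needs at least 2 rows).
    lower[i-1] multiplies x[i-1] in row i; upper[i] multiplies x[i+1]."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

In a time-stepping loop, each Runge-Kutta stage would call `solve_tridiagonal` with the current right-hand side to obtain the time derivatives of the nodal values. The matrix is diagonally dominant, so the elimination needs no pivoting.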


3.2 Magnetic vector potential

Introducing the magnetic potential $a = a(\zeta,\tau)$, the first of (5) is automatically satisfied, while the second one can be
written in the following form:


3d 

a- a 

X - =  0 

fix 

K 

da 

+  e(d)  =  () 

fix 

Performing the same calculations as before, we get the following equations

dx'  \rX  x/  Y  Y 

I  — (w,a)^-(w,e(d)) 

i  dx 

Starling  from  (12),  and  introducing  d  =  |  d,  1  and  a  =  1  a,  ),  wc  gel: 


Vw(0eW 
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Ld  =  -— RL"  q(d)+Sa  +-s(t) 

y-  y 


La  =  -q(d) 


where  q|(d)  =  (  w^.d 


i=l.N  and  S;,  = 


dw;  dWj 

di;  '  dC 


(13) 


4.  Numerical  experiments  and  discussion 

We have studied the propagation of an electromagnetic pulse through the nonlinear dielectric slab. All the numerical
simulations  proposed  in  this  paper  have  been  performed  considering  "pi,  v=IO  (the  thickness  of  the  slab  is  ten  times  the 
wavelength  of  the  incident  field),  and  VLfI/^O  (the  pulse  durtition  of  the  incident  field  is  equal  to  1/2  wavelength). 

In the following, we indicate with (d_A, h_A) and (d_F, h_F) the d, h fields obtained with the A-formulation and the F-formulation respectively.

Figure 2 refers to a "weakly" nonlinear case (the dimensionless amplitude of the incident wave is G_0 = 0.1); the waveform is not strongly deformed. Figure 2a shows the field d_A, while Fig. 2b shows the error of the Galerkin method (normalized difference of d_A and d_F), which demonstrates that d_F is very similar to d_A. The results obtained with the finite difference scheme (FDS) and the characteristic scheme (CS) (see Appendix A and B) are very close to d_A, with the exception that the CS does not show oscillations, while the FDS shows oscillations at the end of the pulse. Thus, we can state that all the proposed schemes are able to handle such a situation, in which the nonlinearity does not play any significant role.

Figure 3 refers to a "strongly" nonlinear case (G_0 = 0.4), with N = 1000; the results of all schemes are shown. The CS (Fig. 3c) shows the formation of a shock; note that continuously changing the parameter G_0 results in an abrupt qualitative change in the properties of the solution. This is probably related to the appearance of bifurcations and chaotic phenomena [7]. The CS solution is taken in this case as the reference solution, because this scheme is naturally able to capture the discontinuous solution exactly [9], while the other schemes are not. It is clear that the FDS (Fig. 3d) is completely unsuitable for studying such a situation, since its solution is both quantitatively and qualitatively wrong. On the other hand, the Galerkin scheme (Fig. 3a) gives a qualitatively good solution, which oscillates before the arrival of the shock. However, the error (Fig. 3b) (defined as before) shows that the d_F solution strongly oscillates after the pulse has passed. We have verified that the oscillations saturate after a suitable period of time. This oscillation is certainly due to the different order of approximation of d_A and d_F: while d_A is piecewise linear, d_F is only piecewise constant, since it is the spatial derivative of f, which is piecewise linear. This conclusion is confirmed by the perfect duality of the behavior of the magnetic fields h_A and h_F. Moreover, we want to stress the fact that a discontinuous function (such as the solution in this case) has an infinite Fourier spectrum, with significant very high frequency components. If either the equations or the scheme were dissipative (e.g. viscous processes or schemes), these high frequency components would be attenuated; since the equations are lossless hyperbolic and the Galerkin scheme coupled with a Runge-Kutta scheme does not introduce numerical viscosity, these components are not attenuated and contribute to undesired oscillations.

These conclusions are confirmed by Fig. 4, in which the Galerkin scheme solution for N = 5000 is presented. It is clear that while the d_A solution (Fig. 4a) seems considerably better than the previous one, the error is asymptotically much greater, indicating a severe oscillation of the d_F solution. In normal conditions, increasing the number of subintervals N would lead to an improvement of the solution, as in the weakly nonlinear case. In this case, since the discontinuity is much sharper (the solution is closer to the ideal discontinuous one), the excitation of high frequency components is more pronounced than before. This in turn causes a stronger undesired oscillation; in addition, the frequency of this oscillation is higher than previously.

Finally, Table I gives the r.m.s. error ε of the difference between d_A and d_F for different values of N and G_0.

When G_0 = 0.1 (weak nonlinearity), ε decreases when N increases; this means that we are converging to the exact solution. On the other hand, for G_0 = 0.4 (strong nonlinearity) this is no longer true; increasing N results in a global worsening of the solution, even if it gets closer to the ideal one in the region in which the pulse is present.

            N=100          N=500          N=1000         N=5000
G_0=0.1     ε=7.79·10^?    ε=6.08·10^?    ε=2.13·10^?    ε=1.85·10^?
G_0=0.4     ε=7.84·10^?    ε=3.45·10^?    ε=7.21·10^?    ε=1.77·10^?

Table I


All these remarks demonstrate that studying the convergence of a Galerkin scheme, even in a simple nonlinear propagation problem, is still an open question.
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Appendix  A:  finite  difference  scheme 


Now we will describe how to solve (5) - (8) with a finite difference scheme. We divide the whole slab (0 < ζ < 1) with N points spaced by Δ, and we approximate the spatial derivative with a second order finite difference scheme (central finite differences):

$$\left.\frac{\partial x}{\partial \zeta}\right|_j \approx \frac{x_{j+1} - x_{j-1}}{2\Delta}, \qquad \varepsilon = O(\Delta^2)$$

Thus, we can write:

$$\frac{\partial h_j}{\partial \tau} = -\frac{e(d_{j+1}) - e(d_{j-1})}{2\Delta}, \qquad \frac{\partial d_j}{\partial \tau} = -\frac{h_{j+1} - h_{j-1}}{2\Delta} \qquad (A.1)$$

where h_j and d_j are the unknowns (nodal values of the fields). So, we get a system of ordinary differential equations, which can be solved with one of the available methods; in particular, we used a fourth-order Runge-Kutta algorithm.

Obviously, we must add to (A.1) initial and boundary conditions; moreover, (A.1) can be written only for j = 2, ..., N-1. Indeed, on the boundaries (j = 1 and j = N), we can use only one-sided finite differences, since we can use only internal points; as a consequence, we must use at least three points to obtain second order schemes.

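In particular, we used the following second order schemes:

$$\left.\frac{\partial x}{\partial\zeta}\right|_1 \approx \frac{-3x_1 + 4x_2 - x_3}{2\Delta}, \qquad \varepsilon = O(\Delta^2)$$

$$\left.\frac{\partial x}{\partial\zeta}\right|_N \approx \frac{3x_N - 4x_{N-1} + x_{N-2}}{2\Delta}, \qquad \varepsilon = O(\Delta^2)$$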
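The second order accuracy of the central and one-sided differences can be checked numerically. The following is a minimal sketch, assuming the standard three-point formulas and a smooth test function chosen only for illustration; it is not the authors' code.

```python
import math


def first_derivative(x, delta):
    """Second order approximation of the spatial derivative at every node."""
    n = len(x)
    dx = [0.0] * n
    # One-sided three-point formula at the left boundary.
    dx[0] = (-3.0 * x[0] + 4.0 * x[1] - x[2]) / (2.0 * delta)
    # Central differences at the interior nodes.
    for j in range(1, n - 1):
        dx[j] = (x[j + 1] - x[j - 1]) / (2.0 * delta)
    # One-sided three-point formula at the right boundary.
    dx[-1] = (3.0 * x[-1] - 4.0 * x[-2] + x[-3]) / (2.0 * delta)
    return dx


def max_error(n):
    """Maximum nodal error for the test function sin(2*pi*zeta) on [0, 1]."""
    delta = 1.0 / (n - 1)
    zeta = [j * delta for j in range(n)]
    approx = first_derivative([math.sin(2.0 * math.pi * z) for z in zeta], delta)
    exact = [2.0 * math.pi * math.cos(2.0 * math.pi * z) for z in zeta]
    return max(abs(a - e) for a, e in zip(approx, exact))


# Halving the spacing should reduce the error by roughly a factor of four.
print(max_error(101) / max_error(201))
```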
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Appendix  B:  characteristic  formulation  and  numerical  scheme 


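Using the equation (6), the system (5) can be written in the following form:

$$\frac{\partial h}{\partial \tau} + v^2(d)\,\frac{\partial d}{\partial \zeta} = 0, \qquad \frac{\partial d}{\partial \tau} + \frac{\partial h}{\partial \zeta} = 0 \qquad (B.1)$$

where $v^2 = \partial e/\partial d$. Let's consider the following ordinary differential equations:

$$\frac{d\zeta}{d\tau} = +v, \qquad \frac{d\zeta}{d\tau} = -v \qquad (B.2)$$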
The integral curves of (B.2) are called the characteristic curves C+ and C- of problem (B.1). Assuming that a+ and a- are smooth enough, the characteristic curves of the same family do not intersect; in this case, we can introduce a pair of curvilinear coordinates α(ζ,τ) and β(ζ,τ) such that α (β) is constant over C+ (C-) curves.

It can be demonstrated [5, 9] that with these positions (B.1) becomes:


ail  ad 

—  +a^.  —  =  0 
a  a  aa 
ail  ad 
— +a_  —  =  0 

a(i  ■  a|) 

aa  ^  da 


(B.3) 


— 

ap""^”  a(i 

which is called the characteristic system, in which the unknowns are ζ, τ, h, d as functions of α and β.

Now we describe how to solve (B.3) numerically [9]. Let's suppose that the values of h and d are known at every point of a curve l (Fig. B.1) in the (ζ,τ) plane and let's consider two points, say A and B, on l. The C+ characteristic starting from A and the C- characteristic starting from B will meet at E. Integrating the first two of (B.3) along C- and C+ respectively, we get two expressions for the value h(ζ_E,τ_E); equating these two values, we get a nonlinear equation for d(ζ_E,τ_E). Solving this nonlinear equation and integrating the last two of (B.3), we obtain the values at E of all the unknowns.
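A single step of this kind of procedure can be sketched as follows. This is only an illustration under stated assumptions: it uses the compatibility relations dh + v dd = 0 along dζ/dτ = +v and dh - v dd = 0 along dζ/dτ = -v, which follow from the system ∂h/∂τ + v²(d) ∂d/∂ζ = 0, ∂d/∂τ + ∂h/∂ζ = 0, it integrates them with the trapezoidal rule, and it solves the resulting nonlinear equation by fixed-point iteration. The function name `meet_point`, the iteration count and the sample speed law are hypothetical and are not taken from the paper.

```python
import math


def meet_point(A, B, v, iterations=50):
    """Advance from points A and B to the intersection E of C+ (from A) and C- (from B).

    A and B are tuples (zeta, tau, h, d); v is a callable returning the
    characteristic speed v(d). Returns (zeta_E, tau_E, h_E, d_E).
    """
    zA, tA, hA, dA = A
    zB, tB, hB, dB = B
    vA, vB = v(dA), v(dB)
    dE = 0.5 * (dA + dB)                      # initial guess for d at E
    for _ in range(iterations):
        vE = v(dE)
        sp = 0.5 * (vA + vE)                  # mean speed along C+
        sm = 0.5 * (vB + vE)                  # mean speed along C-
        # Equate the two expressions for h_E:
        #   h_E = hA - sp*(dE - dA)  and  h_E = hB + sm*(dE - dB)
        dE = (hA - hB + sp * dA + sm * dB) / (sp + sm)
    vE = v(dE)
    sp = 0.5 * (vA + vE)
    sm = 0.5 * (vB + vE)
    hE = hA - sp * (dE - dA)
    # Geometry: zE - zA = sp*(tE - tA) and zE - zB = -sm*(tE - tB).
    tE = (zB - zA + sp * tA + sm * tB) / (sp + sm)
    zE = zA + sp * (tE - tA)
    return zE, tE, hE, dE


# Linear medium (v = 1): the right-going wave h = d is carried unchanged.
print(meet_point((0.0, 0.0, 1.0, 1.0), (1.0, 0.0, 0.0, 0.0), lambda d: 1.0))
```

Repeating this step over all pairs of neighbouring points on the initial curve yields the next row of the characteristic mesh.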
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An  Efficient  Sub-gridding  Algorithm  for  FDTD. 

D. T. Shimizu, M. Okoniewski, M. A. Stuchly

Department of Electrical and Computer Engineering
University of Victoria
Victoria, B.C., Canada

Many electromagnetic problems which can be effectively modeled using FDTD techniques have structures comprising large regions where fields vary slowly, and a few small areas where dramatic changes of the field components in space are expected and a fine discretization is required. In a straightforward approach a dense grid is used throughout the computational space. This frequently is not feasible due to computer resource constraints. Therefore, a few schemes have been previously developed for a dynamic change of the discretization density in the FDTD algorithm. They include mesh refinement in both time and space [2, 3, 4], and mesh refinement in space only
[■^^]  ' 

While the latter approach is much simpler to implement, the time and space sub-gridding yields a much more efficient code. This results from the fact that in the second approach (refinement in space only) the stability condition has to be computed in the smallest cell used, and thus a small time step is utilized in the whole problem space.

We have developed an algorithm that provides mesh refinement in space and time. The previously developed algorithms [3, 1] were not described in sufficient detail to allow us to compare their methods with ours. We have placed the greatest emphasis on providing a smooth transition between the coarse and fine meshes. To achieve this, extrapolation in time and interpolation in space are used, both accurate to the second order. Since the boundary of the two grids is usually positioned in a homogeneous region of the structure where the field components and their derivatives have smooth behaviour, the second order interpolation and extrapolation are expected to provide sufficient accuracy.

The following guidelines were set before developing the actual algorithm:

•  the same stability condition should be used throughout the problem space to keep the numerical dispersion constant [2]

•  to minimize the reflections from the fine mesh, small scaling factors and a sequence of sub-grids, rather than an abrupt transition into a much finer mesh, should be used.

Figure 1: The overlapping meshes adopted in the sub-gridding algorithm

As the main sub-gridding scheme, an algorithm with a scaling factor of 2 was adopted (Fig. 1). The two meshes are offset in space so that the magnetic fields of the coarse mesh and the corresponding magnetic fields of the dense mesh are aligned. Figure 1 schematically presents the alignment of the two meshes at their boundary.

Since field updates are twice as frequent in the fine mesh as in the coarse one, there is not enough information to keep the updates of the outermost nodes of the fine mesh computed via the FDTD scheme. A pulsing overlapping scheme was developed, in which the outermost layer of the fine mesh is dropped in each time sub-step of the refined mesh. At the end of the cycle, the mesh is expanded back to its original size, and the missing field components are computed using interpolation and extrapolation. The gray region in Fig. 1 indicates the pulsing region.

In brief, the algorithm can be described as follows (capital letters denote field quantities of the coarse mesh):

1. Time t = n: interpolate e^n at the overlapping strip of meshes from E^n

2. Time t = n+1/4: find h^(n+1/4) within the sub-gridded mesh, using FDTD

3. Time t = n+1/2:

   •  collapse the fine mesh and find e^(n+1/2) using FDTD

   •  find H^(n+1/2) on the coarse mesh using FDTD

4. Time t = n+3/4:

   •  collapse the fine mesh and find h^(n+3/4) using FDTD

   •  interpolate ... and ... to refine

   •  expand mesh, use extrapolation in time and interpolation in space to obtain missing h^(n+3/4)

5. Time t = n+1: obtain e^(n+1) and E^(n+1) using FDTD

6. cycle

Figure 2: Reflections from the interface with the fine mesh inserted in the homogeneous region of a coaxial line. The fine mesh size in the radial direction is a parameter within the plot family.

This sub-gridding was implemented in a 2D cylindrical TM mode FDTD code (only the E_r, E_z and H_φ field components are allowed). A number of tests were carried out, using a homogeneous coaxial line and a coaxial line with discontinuities. Figure 2 illustrates some of the results obtained, namely the reflections generated by the sub-gridding scheme in a homogeneous coaxial line. The computations have been performed for a Teflon coaxial line of 0.456 and 1.49 mm inner and outer conductor radius, respectively. The cell size of the coarse mesh is 0.1 mm, and the fine mesh dimension is shown as a parameter in the figure. It is worth noting that the reflections introduced by the refined mesh do not exceed 0.35%. The developed sub-gridding scheme satisfies the criteria of stability and low reflections. Its main limitation is a small factor of mesh subdivision and, as a consequence, a need for recursive sub-gridding if a large mesh size reduction is required. To the best of our knowledge, there is no efficient algorithm presently available that satisfies all three requirements.

Figure 3: Reflections from the interface with the fine mesh inserted in the homogeneous region of a coaxial line. The fine mesh size in the longitudinal direction is a parameter within the plot family.
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Using  the  Integral  Forms  of  Maxwell’s  Equations  to  Modify  and 
Improve  the  FDTD  (2,4)  Scheme 

Mohammed  F.  Hadi 
Prof.  Melinda  Piket-May 
Department  of  Electrical  Engineering 
University  of  Colorado  at  Boulder 


INTRODUCTION 

One serious flaw of the finite-difference time-domain method which has received little attention so far is the excessive phase error that accumulates in the field calculations as the EM waves advance in the numeric grid. In the standard (2,2) scheme (second-order differences in time and space) this phase error changes as a function of the propagation angle, α, with maximum error when propagation is along the principal numeric grid axes and minimum error when the propagation angle is at 45° off the principal grid axes.

To illustrate the above statement, a two-dimensional example is chosen: radiation from an infinite line source. To model this problem the 2-D space is divided following the example of Yee [1] and the TM Maxwell's equations are discretized as follows

$$\frac{H_x|^{n+1/2}_{i,j+1/2} - H_x|^{n-1/2}_{i,j+1/2}}{\Delta t} = -\frac{1}{\mu\,\Delta y}\left(E_z|^{n}_{i,j+1} - E_z|^{n}_{i,j}\right)$$

$$\frac{H_y|^{n+1/2}_{i+1/2,j} - H_y|^{n-1/2}_{i+1/2,j}}{\Delta t} = \frac{1}{\mu\,\Delta x}\left(E_z|^{n}_{i+1,j} - E_z|^{n}_{i,j}\right)$$

$$\frac{E_z|^{n+1}_{i,j} - E_z|^{n}_{i,j}}{\Delta t} = \frac{1}{\epsilon\,\Delta x}\left(H_y|^{n+1/2}_{i+1/2,j} - H_y|^{n+1/2}_{i-1/2,j}\right) - \frac{1}{\epsilon\,\Delta y}\left(H_x|^{n+1/2}_{i,j+1/2} - H_x|^{n+1/2}_{i,j-1/2}\right)$$

where Δx = Δy = h = λ/R (with R = 20 typical). The stability criterion [2] forces a maximum limit on the time step that can be chosen for the algorithm

$$\Delta t = \nu\,\Delta t_{max} = \nu\,\frac{h}{c\sqrt{2}}, \qquad 0 < \nu \le 1$$

where ν is the Courant number. The Courant number is defined this way (ν = Δt/Δt_max) to provide a universal definition that is independent of the FDTD scheme used.

The dispersion relation, which is given by [3]

$$\left[\frac{h}{c\,\Delta t}\,\sin\frac{\omega\,\Delta t}{2}\right]^2 = \sin^2\!\left(\frac{\tilde{k} h \cos\alpha}{2}\right) + \sin^2\!\left(\frac{\tilde{k} h \sin\alpha}{2}\right)$$

can be used to find the expected phase error in the algorithm as a function of the propagation angle, α, by solving it for the numeric wave number k̃ and comparing this k̃ with the physical wave number, k. Fig. 1 shows the normalized error in the numeric wave number versus the propagation angle for a resolution factor of R = 20.

Fig. 1 Normalized error in the numeric wave number

When the propagation angle (α) and resolution factor (R) are fixed and the dispersion relation is solved for k̃ at different time steps, i.e., at different values of ν within the limits set by the stability criterion, it is found that decreasing the time step causes the error in k̃ (which controls the phase error) to increase. Fig. 2 shows how the error in k̃ can translate into a phase error in the FDTD algorithm.
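The behaviour described above can be reproduced with a short numerical experiment. The sketch below assumes the standard 2-D Yee dispersion relation with square cells and the time step written as Δt = ν h/(c√2); the function name and the bisection solver are illustrative choices, not part of the paper.

```python
import math


def wavenumber_error(alpha_deg, R=20.0, nu=1.0):
    """Normalized error (k_numeric - k) / k of the 2-D (2,2) scheme.

    alpha_deg: propagation angle in degrees, between 0 and 90.
    R: resolution factor (cells per wavelength), nu: Courant number.
    """
    kh = 2.0 * math.pi / R                      # physical k times cell size h
    alpha = math.radians(alpha_deg)
    # Left-hand side of the dispersion relation, multiplied by h^2.
    lhs = (2.0 / nu ** 2) * math.sin(kh * nu / (2.0 * math.sqrt(2.0))) ** 2

    def residual(x):                            # x = numeric k times h
        return (math.sin(0.5 * x * math.cos(alpha)) ** 2
                + math.sin(0.5 * x * math.sin(alpha)) ** 2 - lhs)

    lo, hi = 0.0, math.pi                       # residual is monotonic here
    for _ in range(200):                        # plain bisection
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (0.5 * (lo + hi) - kh) / kh


for angle in (0, 15, 30, 45):
    print(angle, wavenumber_error(angle))

# Accumulated phase error, in degrees, after 25 wavelengths along a grid axis.
print(360.0 * 25.0 * wavenumber_error(0.0))
```

With ν = 1 the error vanishes at 45° and is largest along the grid axes, and lowering ν increases the on-axis error.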


Fig. 2 Electric field radiated by an infinitely thin line source: Exact (solid line) vs. FDTD (dotted line).
Phase error is approximately 18° at ρ = 25λ


The plot in this figure represents a snapshot in time of the electric field at distances of about 25 wavelengths from the line source. The solid line represents the exact solution given by


p  >  0 


and the dashed line represents the FDTD solution. The line current is chosen as a 900 MHz sinusoid. Fig. 2 shows that the standard (2,2) scheme exhibits a phase error of approximately 18° when the wave has travelled 25 wavelengths with a resolution factor of R = 20. Please note that the error in the wave amplitude is due to the inaccurate modeling of the singularity at ρ = 0, the location of the "infinitely" thin line source.


THE  STANDARD  (2,4)  SCHEME 

The  standard  (2,4)  scheme  refers  to  FDTD  with  second-order  differences  in  time  and  fourth-order  differences  in 
space.  The  corresponding  updating  equations  for  the  TM  Maxwell’s  equations  are  given  by  [4] 

~  E,\lj  I  jn  +  5  oyu  I  nj//  [”+=  + 

- At -  “  24;A^i  1.-'-  + 

-  27£,|”,_,  +  27e,|”,,i  - 


At  24/iAj/  I 

Using  the  same  technique  as  for  the  (2,2)  scheme  the  stability  criterion  can  be  derived  as 

(6/7)/, 
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where  v  is  again  defined  as  v  =  and  the  dispersion  relation  as 
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Tlie  results  obtained  from  solving  this  dispersion  relation  will  be  explained  in  the  next  section. 

THE  MODIFIED  (2,4)  SCHEME 

Starting with one of the updating equations from the standard (2,4) scheme and with the help of Fig. 3, a different form of this equation can be derived that points to a way of improving the algorithm. For example
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Knowing  that 
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tlie  original  updating  equation  can  now  be  written  as 

[  J  FDTD  ^  I-  J  C2  ®  I-  J 


This relation shows that the standard (2,4) updating equation is nothing but a weighted sum of the results from applying the modified Ampere's law on two different loops, C_1 and C_2, as shown in Fig. 3 (the bigger hollow arrows and white circle represent the fields used by the standard (2,4) scheme). Note that the two coefficients on the right hand side add up to unity to preserve the integrity of Maxwell's equations.


Fig. 3 The modified Ampere's law applied on Yee's TM Grid


This new form of the standard (2,4) updating equation shows a straightforward way to include all twelve field values along the outer loop. However, this will force a change in the values of the two right hand side


and the dispersion relation as

To find the optimum values for K and K'', the dispersion relation is again solved for the numeric wave number in terms of K and K'' at all propagation angles and the results are substituted into the following error function


TT  /n  K 


This error function is in turn minimized using an optimization routine in terms of K and K'' to find the optimum values that will give the least global phase error. Table 1 shows the optimum coefficients at f = 900 MHz for several resolution factors along with the corresponding error values. It is important to resolve the coefficients to 9 significant digits since these coefficients control the EM fields' phases, which are more sensitive to small variations than the fields' magnitudes. Fig. 5 shows an error comparison among the three schemes discussed so far: the standard (2,2), the standard (2,4) and the modified (2,4) schemes.


Table 1. Optimum values for K and K'' at 900 MHz

R     K               K''              Error^2
5     -0.144931712    0.102068902      5.426 × 10^-?
10    -0.116192765    0.0734445091     8.979 × 10^-?
15    -0.111802038    0.0692811040     6.444 × 10^-?
20    -0.110322272    0.0678920244     1.963 × 10^-?
25    -0.109646972    0.0672605236     1.264 × 10^-?
30    -0.109282656    0.0669204694     1.208 × 10^-?
35    -0.109063833    0.0667164343     1.283 × 10^-?

When it comes to actual simulations, the modified (2,4) scheme demonstrates an important advantage over the standard (2,4) scheme. The modified (2,4) scheme can be used in a hybrid algorithm along with the standard (2,2) scheme without any adverse effects on the stability of the overall algorithm. This is not the case with the standard (2,4) scheme [5]. Going back to the original example of an infinite line source and solving it using the modified (2,4) scheme yields a clean and phase error-free wave solution as shown in Fig. 6.
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CONCLUSION 


When used to model electrically large structures, the modified (2,4) scheme is capable of providing huge savings in both computer time and memory. For example, to confine the error of the FDTD algorithm within < 40 parts per million, the modified (2,4) scheme will require a resolution factor of R = 5 compared to a resolution factor of R = 142 for the standard (2,2) scheme. In 3-D simulations this 142/5 ratio needs to be raised to the fourth power to obtain the computer time ratio and to the third power to obtain the computer memory ratio between the two schemes. Even when the extra overhead generated within the modified (2,4) scheme is taken into account, the ratios are still huge: over 5 orders of magnitude for the computer time ratio and over 4 orders of magnitude for the computer memory ratio.
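The raw scaling behind these figures is simple arithmetic, reproduced below as a small sketch. It ignores the per-cell overhead of the modified scheme, so the numbers are upper bounds on the actual savings; the variable names are illustrative.

```python
import math

R_STANDARD = 142   # resolution factor needed by the standard (2,2) scheme
R_MODIFIED = 5     # resolution factor needed by the modified (2,4) scheme

ratio = R_STANDARD / R_MODIFIED

# Three space dimensions plus the number of time steps scale the run time.
time_ratio = ratio ** 4
# Only the three space dimensions scale the memory.
memory_ratio = ratio ** 3

print(f"time ratio   = {time_ratio:.3g} "
      f"({math.log10(time_ratio):.2f} orders of magnitude)")
print(f"memory ratio = {memory_ratio:.3g} "
      f"({math.log10(memory_ratio):.2f} orders of magnitude)")
```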
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From  the  Berenger  PML  ABC  to  Micro-Lasers: 

Recent  Advances  in  FD-TD  Modeling  Techniques' 

Allen  Taflove 
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Northwestern  University,  Evanston,  IL  60208 

1. INTRODUCTION

Since 1990, there has been an explosion of interest in the engineering electromagnetic wave community in direct solutions of Maxwell's curl equations on space grids in either the time or frequency domain. The finite-difference time-domain (FD-TD) method, introduced by Yee in 1966 [1], has received perhaps the most attention during this period because of its simplicity and robustness. Recent advances in FD-TD modeling techniques have further improved its modeling accuracy and expanded its range of applications. These advances are succinctly summarized in this review paper:

1. Berenger PML absorbing boundary condition;

2. Dispersive, nonlinear, and gain material models;

3. Active circuit device models;

4. Planar unstructured meshes; and

5. Software development for massively parallel computers.

2.  BERENGER  PML  ABSORBING  BOUNDARY  CONDITION 
2.1  Unbounded-Region  Scattering  Problems 

As RCS measurements have become more sophisticated and attained dynamic ranges greater than 70 dB, it has become important to extend the accuracy range of FD-TD numerical modeling to
balance  theory  and  measurements.  However,  attainment  of  70  dB  dynamic  range  requires 
suppression  of  computational  noise  to  amplitudes  less  than  10"  times  the  incident  wave.  This 
has  been  a  very  difficult  challenge,  especially  in  the  area  of  absorbing  boundary  conditions 
(ABCs)  that  simulate  the  non-reflective  action  of  the  walls  of  an  anechoic  chamber. 

Until  late  1994,  the  principal  ABCs  used  in  FD-TD  codes  were  published  originally  by 
Mur  [2]  and  Liao  [3].  These  ABCs  provided  effective  outer-boundary  reflection  coefficients  in 
the  range  of  0.5%  -  5.0%  for  most  FD-TD  simulations.  To  obtain  simulations  having  dynamic 
ranges  comparable  to  those  of  recent  anechoic  chambers,  a  reduction  of  >40-dB  (>100:1)  has 
been  needed  in  these  reflectivities.  After  more  than  a  decade  of  only  incremental  progress  in 
ABC  theory,  it  was  becoming  clear  that  this  would  require  a  fundamental  advance. 


¹This paper is a condensation of the invited paper, "Advances in Finite-Difference Time-Domain (FD-TD) Numerical Modeling Techniques for Maxwell's Equations," presented at the Ninth International Conference on Antennas and Propagation (ICAP'95), Eindhoven University of Technology, The Netherlands, April, 1995.
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Such  an  advance  appears  to  be  at  hand  with  Berenger's  recent  publication  of  the  novel 
"perfectly  matched  layer"  (PML)  ABC  for  2-D  FD-TD  meshes  [4],  which  provides  orders-of- 
magnitude  improved  performance  relative  to  earlier  techniques.  PML  is  based  upon  a  splitting 
of  E  or  H  components  in  the  absorbing  boundary  region  with  the  possibility  of  assigning 
losses  to  the  individual  split  field  components.  The  net  effect  of  this  is  to  create  a  non-physical 
absorbing  medium  adjacent  to  the  outer  FD-TD  mesh  boundary  that  has  a  wave  impedance 
independent of the angle of incidence and frequency of outgoing scattered waves. Berenger reported reflection coefficients for the PML ABC as low as 1/3000th that of the Mur or Liao ABC's when using a quadratically-graded PML loss profile. Katz et al [5] confirmed these remarkable claims and extended Berenger's ABC to 3-D open-region FD-TD simulations.

2.2  Effect  of  PML  Grading  Order 

Reuter and Taflove [6] reported how the grading of the electric and magnetic loss with depth in the PML affects its performance. Their numerical procedure was identical to that of [4, 5] with the exception that the grading of the PML loss was specified to be either order-1 (linear), order-2 (quadratic, Berenger's baseline), order-3 (cubic), order-4, or order-5 for any desired PML thickness and normal-incidence reflectivity parameter, R(0).
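A graded loss profile of this kind can be sketched as follows, assuming the polynomial grading σ(ρ) = σ_max (ρ/δ)^n and the relation R(0) = exp(-2 σ_max δ / ((n+1) ε₀ c)) from Berenger's formulation. The function names, the layer thickness and the target reflectivity used in the usage lines are illustrative assumptions, not values from [6].

```python
import math

EPS0 = 8.854187817e-12   # vacuum permittivity, F/m
C0 = 299792458.0         # speed of light in vacuum, m/s


def sigma_max(order, thickness, r0):
    """Peak conductivity giving normal-incidence reflectivity r0."""
    return -(order + 1) * EPS0 * C0 * math.log(r0) / (2.0 * thickness)


def sigma_profile(rho, order, thickness, r0):
    """Conductivity at depth rho inside a layer of the given thickness."""
    return sigma_max(order, thickness, r0) * (rho / thickness) ** order


def normal_reflection(order, thickness, r0, samples=20000):
    """Recover R(0) by integrating the loss profile with the midpoint rule."""
    step = thickness / samples
    integral = step * sum(
        sigma_profile((i + 0.5) * step, order, thickness, r0)
        for i in range(samples)
    )
    return math.exp(-2.0 * integral / (EPS0 * C0))


# Hypothetical 16-cell layer with 1 mm cells and a target R(0) of 1e-5.
for n in range(1, 6):
    print(n, sigma_max(n, 16e-3, 1e-5), normal_reflection(n, 16e-3, 1e-5))
```

For a fixed thickness and R(0), a higher grading order concentrates the loss deeper in the layer and raises the peak conductivity.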

It  was  found  that  the  optimum  grading  of  the  PML  loss  is  generally  not  quadratic  and 
depends  upon  the  PML  thickness.  For  example,  for  a  thickness  of  16  cells,  fourth-order  PML 
loss  grading  yields  local  reflections  at  -55  dB  relative  to  quadratic  PML  grading  in  2-D,  and 
-48  dB  relative  to  quadratic  grading  in  3-D.  The  resulting  optimized  PML  ABC  is  locally 
300,000  times  less  reflective  than  the  second-order  Mur  ABC,  and  30,000  times  less 
reflective  than  the  third-order  Liao  ABC. 

2.3  Waveguide  Problems 

FD-TD  is  increasingly  being  used  to  model  the  propagation  of  waves  in  microwave  and  optical 
circuits.  A  key  problem  here  is  the  accurate  termination  of  guided-wave  structures  extending 
beyond  the  FD-TD  grid  boundaries.  The  difficulty  arises  because  propagation  in  a  waveguide 
can be multimodal and dispersive, and the ABC utilized to terminate the waveguide must be able to absorb energy having widely varying transverse distributions and group velocities, v_g.

When applied to terminate guided wave structures, typical ABC's developed for free-space problems perform best for narrowband energy propagation where v_g is well defined. Recently, these ABC's have been specialized to account for variations of the waveguide modal v_g with frequency, for example, Bi et al [7] and Moglie et al [8]. However, these methods are not completely satisfactory, being either approximate or computationally intensive.

Reuter et al [9] applied Berenger's PML ABC to terminate FD-TD models of general 2-D waveguiding structures for transverse magnetic (TM) modes. The first application was an air-filled perfectly conducting (PEC) parallel-plate waveguide excited by a wideband carrier pulse which launched a TM_1 mode towards the PML termination. The group velocity of the pulse spectral components ranged from zero at f_cutoff to ≈ 0.98c at 5 f_cutoff. Using a 16-cell quadratically-graded PML, reflections between -70 dB and -95 dB were calculated at all frequencies between these two points.
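The quoted range of group velocities follows from the textbook dispersion of a mode in an air-filled PEC waveguide, v_g = c sqrt(1 - (f_cutoff/f)^2). The short sketch below evaluates it at a few frequencies; the function name is an illustrative choice.

```python
import math


def normalized_group_velocity(f_over_fc):
    """Return v_g / c for a frequency given as a multiple of the cutoff frequency."""
    if f_over_fc < 1.0:
        raise ValueError("the mode does not propagate below cutoff")
    return math.sqrt(1.0 - 1.0 / f_over_fc ** 2)


for multiple in (1.0, 1.5, 2.0, 5.0):
    print(multiple, normalized_group_velocity(multiple))
```

The wide spread of v_g across the pulse spectrum is what makes a single-velocity ABC a poor match for this problem.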

Reuter et al [9] then applied the PML ABC to terminate the FD-TD model of a 2-D micron-scale dielectric film optical waveguide. A femtosecond-regime pulse modulating an optical carrier was used to launch three distinct modes having widely-varying frequency-dependent
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propagation  factors.  The  system  was  terminated  by  extending  the  air,  film,  and  substrate  layers 
into  matching  16-cell  thick,  quadratically-graded  PML  absorbers.  FD-TD  calculations  showed 
a  composite  reflectivity  below  -80  dB  for  the  PML  ABC  across  the  entire  incident  spectrum. 
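
As a rough illustration of what a quadratically graded layer looks like, the sketch below builds a 16-cell conductivity profile sigma(x) = sigma_max (x/delta)^2. It chooses sigma_max from the commonly quoted normal-incidence estimate for a PEC-backed quadratic layer, R(0) = exp(-2 sigma_max delta / (3 eps0 c)). The cell size, the target reflection, and the function names are assumptions made for this example; they are not parameters reported in [9], and the estimate ignores discretization effects.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
C0 = 299_792_458.0       # speed of light in vacuum, m/s

def quadratic_pml_profile(n_cells, dx, target_reflection):
    """Return (sigma_max, per-cell sigma) for a quadratic grading."""
    delta = n_cells * dx  # total layer thickness
    sigma_max = -3.0 * EPS0 * C0 * math.log(target_reflection) / (2.0 * delta)
    # Sample the profile at cell centers, measured from the inner PML face.
    sigma = [sigma_max * (((i + 0.5) * dx) / delta) ** 2 for i in range(n_cells)]
    return sigma_max, sigma

def theoretical_reflection(sigma_max, delta):
    """Normal-incidence amplitude reflection estimate for quadratic grading."""
    return math.exp(-2.0 * sigma_max * delta / (3.0 * EPS0 * C0))

dx = 1.0e-3                                   # hypothetical cell size, m
sigma_max, sigma = quadratic_pml_profile(16, dx, 1.0e-4)
r0 = theoretical_reflection(sigma_max, 16 * dx)
print(20.0 * math.log10(r0))                  # target expressed in dB
```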

These  results  showed  that  the  PML  ABC  has  a  broadband  effectiveness,  robustness,  and 
computational  efficiency  unmatched  by  previous  ABC’s  for  FD-TD  waveguide  models.  PML 
is  local  in  space/time  and  requires  no  knowledge  of  modal  field  distributions,  multimoding,  or
dispersion  characteristics  of  the  guided  wave.  Extension  to  3-D  PEC  and  dielectric  waveguide 
models  is  straightforward.  Another  useful  application  is  for  FD-TD  modeling  of  problems 
involving  the  earth-air  interface,  in  fact  a  subset  of  the  three-layer  dielectric  geometry  of  [9]. 

3.  DISPERSIVE,  NONLINEAR,  AND  GAIN  MATERIAL  MODELS 

The  usage  of  short  electromagnetic  pulses,  whether  in  radar  applications  or  in  lasers/nonlinear 
optics,  requires  understanding  of  the  nature  of  pulse  interactions  with  materials  over  wide 
bandwidths.  In  the  case  of  high-power  microwave  or  laser  engineering,  the  pulses  are  likely  to 
have  a  sufficiently  high  intensity  such  that  material  nonlinearity  can  also  play  an  important  role. 
Overall,  the  key  factors  in  short-pulse  physics  are  material  dispersion,  nonlinearity,  and  gain.

Two  recent  advances  in  FD-TD  computational  technology  permit  effective  modeling  of 
these  material  properties  at  the  macroscopic  (phenomenological)  level.  The  first  is  the  recursive 
convolution  (RC)  method,  a  highly  efficient  approach  to  model  complicated  linear  dispersions 
consisting  of  an  arbitrary  number  of  Debye  and  Lorentzian  relaxations.  The  second  is  the 
auxiliary  differential  equation  (ADE)  method,  which  permits  modeling  of  nonlinearities  and 
dispersive  nonlinearities  in  addition  to  linear  dispersions  at  the  cost  of  an  increase  in 
computational  complexity  relative  to  the  RC  approach.  The  ADE  method  can  also  model  linear 
and  nonlinear  dispersive  gain  media,  such  as  those  found  in  lasers. 

3.1  Recursive  Convolution  Method 

Luebbers  et  al  [10]  reported  an  efficient,  "on-the-fly"  recursive  convolution  (RC)  approach  to
model  electromagnetic  wave  interactions  with  linear  dielectric  materials  having  combinations  of 
multiple  Debye  and  Lorentzian  dispersions.  Here,  a  basic  assumption  is  that  the  variation  of
the  electric  susceptibility,  χ_e(ω),  with  frequency  is  given  by  a  general  rational  function
expression.  After  a  partial-fraction  expansion,  the  corresponding  time-domain  susceptibility
function,  χ_e(t),  is  obtained  by  inverse  Fourier  transformation,  yielding  a  finite  sum  of
exponentially  decaying  sinusoids  and  simple  exponentials.  Upon  substitution  into  the 
differential  form  of  Ampere's  Law,  the  resulting  displacement  current  term  is  represented  by  a 
corresponding  sum  of  convolutions  of  the  decaying  exponential  functions  with  E. 

Luebbers  et  al  [10]  made  the  key  observation  that,  in  a  numerical  realization  of  each  of  the
convolutions  as  a  discrete  summation,  all  but  the  last  term  of  the  sum  could  be  written  in  terms 
of  the  previous  time  step's  evaluation  of  the  sum  multiplied  by  a  constant  exponential  decay 
factor.  Therefore,  the  summation  over  past  E  values  could  be  evaluated  recursively  without 
having  to  store  the  past  fields  and  without  having  to  redo  the  complete  sum  every  time  step. 
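
The recursion can be illustrated for a single decaying-exponential kernel. The sketch below is not the exact update of [10]: it drops the susceptibility weighting constants and uses a made-up field history and made-up time constants. It only shows that the running sum S^n = E^n + a S^(n-1), with a = exp(-dt/tau), reproduces the full discrete convolution without storing past fields.

```python
import math

def full_convolution(e_hist, decay):
    """Direct sum over all stored past field samples (the costly way)."""
    n = len(e_hist) - 1
    return sum(e_hist[n - m] * decay ** m for m in range(n + 1))

def recursive_step(prev_sum, e_new, decay):
    """One-term update: newest sample plus the decayed previous sum."""
    return e_new + decay * prev_sum

dt, tau = 1.0e-12, 8.0e-12           # hypothetical time step and relaxation time
decay = math.exp(-dt / tau)          # constant exponential decay factor
e_samples = [math.sin(0.3 * n) for n in range(200)]  # made-up field history

running = 0.0
max_err = 0.0
for n, e in enumerate(e_samples):
    running = recursive_step(running, e, decay)
    direct = full_convolution(e_samples[: n + 1], decay)
    max_err = max(max_err, abs(running - direct))

print(max_err)  # difference between the two evaluations (round-off level)
```

The recursive form needs one stored number per kernel term, while the direct form needs the entire field history.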

This  procedure  leads  to  great  efficiency  in  computer  resources.  For  each  term  used  in 
χ_e(t),  one  additional  storage  variable  is  required  for  each  electric  field  component  at  each
grid  cell.  This  variable  is  real  if  the  corresponding  frequency-domain  pole  is  first-order,  and 
complex  if  the  pole  is  second-order.  On  the  arithmetic  side,  the  addition  of  a  pole  requires  only 
the  extension  of  each  of  the  summations  by  a  single  term.  Because  an  arbitrary  susceptibility 
function  can  be  expanded  in  a  series  of  complex  exponentials  using  Prony's  or  similar  methods
(yielding  a  set  of  real  and  complex  poles),  a  material  having  a  complicated  dispersion  can  be 
modeled  simply  and  efficiently  from  both  a  computer  storage  and  algorithm  viewpoint. 

There  are  two  problems,  however,  with  the  RC  approach  of  [10].  First,  the  discrete 
convolutions  are  only  first-order  accurate  in  the  space/time  discretization  due  to  their
rectangular-rule  realization  of  the  underlying  continuous  integrals.  Second,  the  RC  approach 
cannot  be  used  for  materials  that  are  simultaneously  dispersive  and  nonlinear  because 
convolution  is  based  upon  a  linear  superposition  integral.  This  prevents  the  application  of
the  RC  method  to  model  the  important  class  of  nonlinear  electro-optical  materials. 

3.2  Auxiliary  Differential  Equation  Method 

Kashiwa  and  Fukai  [11]  and  Joseph  et  al  [12]  described  an  auxiliary  differential  equation 
(ADE)  method  for  FD-TD  modeling  of  material  dispersions  based  upon  applying  the  inverse 
Fourier  transform  to  the  constitutive  relation  between  D(ω)  and  E(ω).  This  provided  a  time-
domain  ordinary  differential  equation  (ODE)  relating  D(t)  and  E(t)  that  could  be  time-marched
in  parallel  with  the  Yee  algorithm.  While  useful  and  accurate  for  a  single  Lorentzian 
dispersion,  this  approach  proved  difficult  to  systematically  extend  to  multiple  relaxations. 
However,  as  discussed  by  Taflove  [13],  it  is  possible  to  refine  this  approach  for  a  material 
having  an  arbitrary  number  of  relaxations  to  yield  a  system  of  low-order  ODE's.  No  inverse 
Fourier  transformation  is  needed,  and  only  one  ODE  is  generated  per  dielectric  relaxation. 

Consider  a  material  dispersion  characterized  by  M  Lorentzian  responses.  As  noted  earlier,
the  polarization  of  the  E  components  can  be  expressed  as  a  sum  of  M  convolution  integrals. 
However,  the  key  property  that  drives  the  ADE  formulation  is  that  each  convolution  kernel 
function,  χ_e(t),  satisfies  a  linear,  second-order  ODE.  This  property  makes  it  possible  to  treat
each  of  the  M  convolution  integrals  as  a  new  dependent  variable  which  satisfies  a  second-order 
ODE  in  time.  In  turn,  this  yields  a  system  of  M  coupled  second-order  ODE's  that  evolves  the 
time  response  of  the  M  convolution  integrals.  Central-difference  time  integration  of  this 
system  provides  a  second-order  accurate  calculation  of  the  composite  polarization,  which  is 
then  used  to  time-advance  E. 
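
A minimal sketch of the central-difference step for one Lorentzian term (M = 1) follows. It assumes the standard Lorentz polarization equation P'' + 2 d P' + w0^2 P = eps0 d_eps w0^2 E and drives it with a prescribed field rather than coupling it to the Yee update, so it is not the full scheme of [13]. All parameter values are normalized and illustrative.

```python
def lorentz_ade_step(p_now, p_prev, e_now, dt, omega0, damping, eps0_delta_eps):
    """Advance P by one step using central differences in time for
    P'' + 2*damping*P' + omega0**2 * P = eps0_delta_eps * omega0**2 * E."""
    w2dt2 = (omega0 * dt) ** 2
    numerator = ((2.0 - w2dt2) * p_now
                 - (1.0 - damping * dt) * p_prev
                 + eps0_delta_eps * w2dt2 * e_now)
    return numerator / (1.0 + damping * dt)

# Normalized, illustrative parameters (assumptions, not from the paper).
omega0, damping, dt = 1.0, 0.1, 0.05
eps0_delta_eps = 2.0   # static polarization per unit field
e_field = 1.0          # field switched on at t = 0 and then held constant

p_prev, p_now = 0.0, 0.0
for _ in range(5000):
    p_prev, p_now = p_now, lorentz_ade_step(
        p_now, p_prev, e_field, dt, omega0, damping, eps0_delta_eps)

print(p_now)  # rings down toward the static value eps0_delta_eps * e_field
```

For M relaxations, one such update would be carried per term and the results summed to form the composite polarization.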

As  shown  in  [13],  the  ADE  approach  provides  superior  convergence  and  accuracy  relative 
to  the  RC  method  of  [10]  for  materials  with  multiple  Lorentzian  relaxations.  However,  the 
computational  burden  of  the  ADE  method  is  substantially  greater  than  that  of  the  RC  technique 
because  of  the  need  to  solve  a  system  of  M  equations  rather  than  merely  sum  M  terms.

3.3  Application  of  the  ADE  Method  to  Nonlinear  Dispersive  Media 
(Nonlinear  Optics) 

The  ADE  method  maintains  unique  capabilities  relative  to  the  RC  approach  in  FD-TD  modeling 
of  electromagnetic  fields  in  nonlinear  dispersive  dielectrics.  As  discussed  in  Goorjian  and 
Taflove  [14],  the  ADE  method  can  model  a  dielectric  having  multiple  frequency-dependent 
nonlinear  as  well  as  linear  relaxations.  The  key  is  again  the  treatment  of  the  convolution 
integrals.  Because  each  kernel  function  satisfies  a  second-order  ODE,  the  convolution  integrals 
can  again  be  treated  as  dependent  variables  which  in  this  case  satisfy  a  system  of  coupled, 
nonlinear,  second-order  ODE's.  Using  central-differencing  in  time,  this  system  is  evolved  to 
provide  the  linear  and  nonlinear  components  of  the  dielectric  polarization. 

The  ADE  approach  has  permitted  the  initial  FD-TD  modeling  of  temporal  optical  soliton
formation  and  propagation  in  1-D  half-spaces  (Goorjian  and  Taflove  [14])  and  in  2-D  dielectric 
waveguides  (Joseph  et  al  [15]).  Joseph  and  Taflove  also  reported  spatial  optical  soliton 
formation,  propagation,  and  mutual  deflection  in  2-D  homogeneous  media  [16],  and 
Ziolkowski  and  Judkins  reported  self-focusing  of  short  optical  pulses  [17]  and  wide-angle 
scattering  of  optical  pulses  propagating  within  nonlinear  corrugated  structures  [18]. 

3.4  Application  of  the  ADE  Method  to  Dispersive  Gain  Media 
(Active  Lasing  Media) 

Hagness  and  Taflove  [19]  reported  a  new  approach  based  upon  the  ADE  method  that  permits 
wideband  modeling  of  the  Lorentzian  frequency-dispersive  gain  found  in  a  homogeneously 
broadened  two-level  lasing  system.  This  method  provided  second-order  accuracy  and 
numerical  stability  extending  over  hundreds  of  thousands  of  time  steps.  Relative  to  exact 
solutions  for  optical  wave  propagation  in  1-D,  the  frequency-dependent  gain  calculated  by  this 
method  had  worst-case  magnitude/phase  errors  of  only  0.01%  to  1%  for  FD-TD  grid
resolutions  between  λ/400  and  λ/40,  respectively.

It  is  believed  possible  to  expand  the  range  of  physics  modeled  by  the  ADE  formulation  of 
FD-TD  to  include  multiple  Lorentzian  resonances  of  the  gain,  frequency-dependent  nonlinear 
gain  saturation  effects,  and  simultaneous  multiple  Lorentzian  relaxations  of  the 
linear  /  nonlinear  dielectric  susceptibility.  Success  in  this  research  would  mean  a 
comprehensive  phenomenological  modeling  tool  for  the  full-wave  pulse  dynamics  of  micron-
scale  semiconductor  lasers.

3.5  The  First  FD-TD  Laser  Oscillator  Model 

Following  the  above  theme,  Hagness  and  Taflove  [19]  also  reported  what  they  believe  to  be  the 
first  FD-TD  model  of  a  laser  oscillation  originating  from  low-level  Gaussian  noise  in  the  laser
cavity.  Here,  their  Lorentzian  gain  model  was  modified  to  include  a  typical  gain  saturation 
function.  This  was  applied  to  a  1-D  microcavity  laser  model  spanning  about  5  microns. 

In  the  steady  state,  their  computed  laser  output  waveform  was  a  pure  sinusoid  having  a 
frequency  corresponding  to  the  cavity  mode  located  at  the  peak  of  the  Lorentzian  gain  spectrum 
of  the  lasing  medium.  The  FD-TD  model  properly  rejected  multimode  oscillations  at  other 
above-threshold  cavity  resonances  having  frequencies  off  the  peak  of  the  gain  curve.  The 
proper  lasing  mode  was  selected  preferentially  by  the  FD-TD  simulation  due  to  the  action  of  the 
assumed  gain-saturation  nonlinearity. 

Two  other  agreements  with  accepted  results  reinforced  the  validity  of  the  FD-TD  numerical 
laser  model  of  [19]:  (a)  The  calculated  turn-on  delay  of  the  numerical  laser  from  the  Gaussian
noise  seed  increased  properly  as  the  peak  magnitude  of  the  Lorentzian  gain  function  was
reduced  to  the  lasing  threshold  of  the  primary  cavity  mode;  (b)  The  FD-TD  model  properly
calculated  a  linear  increase  in  the  output  intensity  of  the  laser  as  the  peak  value  of  the  Lorentzian
gain  function  increased  beyond  the  lasing  threshold  of  the  primary  cavity  mode. 

3.6  Comment 

It  should  be  noted  that  the  use  of  the  full-wave  FD-TD  Maxwell's  equations  approach  to  model
the  pulse  dynamics  of  dispersive,  nonlinear,  and  gain  media  is  novel.  The  optics  community 
has  routinely  made  paraxial  and  slowly-varying  envelope  approximations  that  result  in  the  class
of  generalized  nonlinear  Schrödinger  equations  (GNLSE),  as  in  Agrawal  [20].  The  least
approximate  methods  for  GNLSE  solve  nonlinear  scalar  equations  for  the  envelope  of  a 
propagating  optical  pulse,  discarding  the  sinusoidal  carrier.  Examples  include  the  split-step 
Fourier  method  (used  to  simulate  propagation  of  optical  pulses  in  low-loss  fibers  over  very 
long  optical  distances),  and  the  propagating  beam  method  (used  to  model  directional  couplers).

Relative  to  such  approaches,  FD-TD  achieves  robustness  by  directly  solving  Maxwell's 
equations  for  fundamental  quantities  (the  E  and  H  fields  in  space  and  time),  rather  than  using 
asymptotic  and  paraxial  approximations  and  calculating  nonphysical  envelope  functions. 
FD-TD  permits  a  rigorous  treatment  of  optical  structures  having  features  comparable  in  size  to 
the  wavelength,  a  key  advantage  relative  to  previous  modeling  tools  in  the  optics  community. 

4.  ACTIVE  CIRCUIT  DEVICE  MODELS

Thomas  et  al  [21]  reported  that  the  lumped-circuit  behavior  of  linear  and  nonlinear  active 
devices  can  be  directly  incorporated  into  a  generalized  3-D  FD-TD  Maxwell’s  equations 
solution.  Here,  the  circuit  simulator,  SPICE,  was  linked  to  FD-TD  so  that  SPICE  would  time- 
step  Ampere's  Law  at  grid  locations  where  a  lumped-circuit  element  was  specified.  In  this 
way,  the  lumped  element  could  be  an  arbitrarily  large  circuit  having  a  description  contained  in  a 
standard  SPICE  file.  Thus,  all  of  the  extensive  device  models  in  SPICE  could  be  used  directly 
in  the  FD-TD  simulation  without  the  need  to  duplicate  the  model  development.  Further,  the 
efficient  circuit  integration  methods  used  in  SPICE  would  be  directly  available  without  any 
need  for  user-implemented  integration  schemes. 

Thomas  et  al  [21,  22]  reported  successful  applications  of  the  hybrid  FD-TD/SPICE
modeling  approach.  In  [21],  the  model  involved  a  stripline-mounted  VHF  tuned  amplifier
consisting  of  a  single  NPN  bipolar  junction  transistor  provided  with  inductor-capacitor 
networks  for  base  and  collector  impedance  matching.  In  [22],  the  model  involved  a  coupled 
pair  of  patch  antennas  excited  by  locally-mounted  Gunn  diodes.  Very  good  agreement  was 
obtained  relative  to  benchmark  data  for  voltages,  currents,  and  fields  in  both  cases. 

The  hybrid  FD-TD/SPICE  tool  will  be  optimally  applied  when  the  speed  of  a  circuit  is  so 
high  and  its  physical  embedding  is  so  complex  that  it  is  crucial  to  model  electromagnetic  wave
"artifacts."  A  wide  range  of  digital  applications  is  expected  as  clock  speeds  approach 
microwave  frequencies.  Analog  applications  will  include  analysis  of  linearity,  intermodulation, 
harmonic  generation,  and  conversion  efficiency  of  microwave  and  millimeter  wave  integrated 
circuits.  Another  category  of  applications  will  include  radiation,  especially  by  arrays  of  patch 
antennas  excited  by  semiconductor  devices  located  directly  at  the  antenna  [22].  FD-TD/SPICE 
should  also  be  useful  in  modeling  circuit  upset  due  to  external  electromagnetic  fields  generated 
by  lightning,  electromagnetic  pulse,  and  high-power  microwaves. 

5.  PLANAR  UNSTRUCTURED  MESHES 

The  computational  requirements  of  a  general  unstructured  3-D  FD-TD  algorithm  can  be  greatly 
reduced  by  exploiting  symmetries  in  the  model.  As  discussed  by  Gedney  and  Lansing  [23],  an 
important  opportunity  of  this  type  arises  for  digital  and  microwave  printed  circuits  which  have 
planar  symmetry.  Such  circuits  can  be  uniquely  described  by  a  projection  onto  a  2-D  plane. 
Here,  the  FD-TD  grid  used  to  analyze  the  3-D  problem  can  be  described  by  an  unstructured 
2-D  grid  in  a  transverse  plane  and  as  a  regular  grid  in  the  third-dimension.  Only  the  2-D  grid 
locations  need  be  stored.  This  greatly  relaxes  the  memory  requirements  of  the  algorithm  to  the
extent  that  it  is  actually  as  memory  efficient  as  basic  FD-TD. 

Conceptually,  the  grid  of  Gedney's  and  Lansing's  "planar  generalized  Yee  algorithm"  [23] 
can  be  generated  by  extruding  an  x-y  plane  (horizontal)  2-D  unstructured  grid  in  the  z  (vertical) 
direction  and  segmenting  it  at  discrete  z  intervals.  A  secondary  grid  is  staggered  within  this 
primary  grid  such  that  its  vertices  lie  at  the  centroids  of  the  primary  grid  cells.  Further,  the 
edges  of  the  secondary  grid  connect  the  centroids  by  passing  through  the  faces  of  the  primary 
grid.  The  E  and  H  fields  are  then  decomposed  into  orthogonal  components.  Subsequently,  the
transverse  E  and  H  fields  are  mapped  onto  the  horizontal  edges  of  the  primary  and  secondary
grids,  respectively.  Likewise,  the  vertical  E  and  H  fields  are  mapped  onto  the  vertical  edges  of
the  primary  and  secondary  grids,  respectively.  The  fields  are  assumed  to  be  constant  along 
their  respective  edge  lengths  as  well  as  over  the  dual  faces  through  which  they  pass. 
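
The primary/secondary grid relationship in the transverse plane can be sketched for a triangular mesh. The example below is a simplified illustration, not the data structure of [23]: it assumes triangular primary cells, places one secondary vertex at each cell centroid, and creates one secondary edge for every interior primary edge shared by two cells. The two-triangle mesh and the function names are made up for the example.

```python
from collections import defaultdict

def centroid(points, tri):
    """Centroid of a triangle given by three indices into `points`."""
    xs = [points[i][0] for i in tri]
    ys = [points[i][1] for i in tri]
    return (sum(xs) / 3.0, sum(ys) / 3.0)

def dual_edges(triangles):
    """Map each interior primary edge to the pair of cells it separates.

    Each entry corresponds to one secondary-grid edge joining the two
    cell centroids and crossing that primary edge.
    """
    edge_to_cells = defaultdict(list)
    for cell, tri in enumerate(triangles):
        for k in range(3):
            edge = tuple(sorted((tri[k], tri[(k + 1) % 3])))
            edge_to_cells[edge].append(cell)
    return {e: tuple(c) for e, c in edge_to_cells.items() if len(c) == 2}

# Unit square split into two triangles along the diagonal 0-2.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]

secondary_vertices = [centroid(points, t) for t in triangles]
secondary_edges = dual_edges(triangles)
print(secondary_vertices, secondary_edges)
```

Because the grid is regular in the third dimension, only this 2-D connectivity would need to be stored.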

Based  on  this  discretization,  Faraday’s  and  Ampere’s  Laws  are  approximated  by  choosing 
the  surfaces  of  integration  to  be  the  faces  of  the  secondary  and  primary  grids,  respectively. 
This  leads  to  explicit  time-stepping  expressions  for  the  D  and  B  flux  densities  in  the  transverse
plane  and  the  vertical  field  intensities,  E_z  and  H_z.  Note  that  the  D  and  B  fluxes  are  normal  to  the  faces.
However,  the  corresponding  field  intensities  on  the  dual  edges  passing  through  these  faces  are 
not  necessarily  normal  to  the  faces.  As  a  result,  the  flux  densities  must  be  projected  onto  the 
edges  before  the  dual  fields  can  be  updated.  An  auxiliary  operator  must  be  introduced  to 
perform  this  projection.  Madsen's  general  3-D  projection  scheme  [24]  is  useful  since  the  flux 
projected  onto  the  edges  has  zero  divergence  in  a  charge-free  medium,  and  the  time-stepping 
algorithm  maintains  numerical  stability.  With  the  assumed  planar  symmetry  and  the  resulting 
orthogonality  of  the  vertical  and  transverse  fields,  only  fields  within  the  transverse  plane  are 
needed  for  the  interpolation.  This  results  in  a  simplified  projection  of  the  D  and  B  fluxes
relative  to  the  most  general  3-D  case. 

Gedney  and  Lansing  [23]  (see  also  their  Chapter  11  in  [13])  reported  a  number  of 
application  examples  of  the  planar  generalized  Yee  algorithm  including  32-GHz  Wilkinson  and 
Gysel  power  dividers,  and  signal  lines  and  vias  within  the  IBM  3090  thermal  conduction 
module.  Excellent  computational  accuracy  and  efficiency  were  indicated,  complementing  the
high  level  of  geometrical  modeling  flexibility  afforded  by  the  unstructuring  of  the  FD-TD  mesh 
in  two  coordinate  dimensions.  This  approach  has  substantial  promise  for  modeling  microwave 
and  digital  circuit  boards  of  great  complexity. 

6.  SOFTWARE  DEVELOPMENT  FOR  MASSIVELY  PARALLEL  COMPUTERS

In  Chapter  16  of  [13],  Gedney  and  Barnard  provided  detailed  descriptions  of  their  recent  highly 
efficient  ports  of  unstructured-  and  structured-grid  FD-TD  algorithms  to  the  INTEL  Delta  and 
CRAY  T3D  massively  parallel  computers.  Barnard  also  addressed  the  issue  of  how  fast  the
T3D  can  run  the  largest  possible  3-D  FD-TD  model.  His  procedure  involved  measuring  T3D 
performance  while  continually  scaling  the  grid  size  upwards  to  keep  each  processing  element 
(PE)  fully  involved.  He  found  that  the  performance  scaled  nearly  linearly,  reaching  a  projected 
steady  rate  of  50  GFLOPS  on  2,048  PE’s  for  the  most  optimized  code.  Since  a  2,048-PE  T3D 
with  64  MBytes/PE  has  a  total  memory  of  16.384  GWords,  it  would  be  possible  to  run  in-core 
a  two-billion  grid-cell  problem  containing  12-billion  unknown  fields.  This  unprecedented 
combination  of  speed  and  memory  capacity  provides  large  opportunities  for  each  of  the 
emerging  FD-TD  models  reviewed  in  this  paper,  as  well  as  for  traditional  FD-TD  simulations. 
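
The memory figures above can be reproduced with a few lines of arithmetic. The sketch assumes 8-byte (64-bit) words and six field unknowns per grid cell (three E and three H components); both assumptions are consistent with the quoted totals but are not stated explicitly in the text.

```python
N_PE = 2048            # processing elements
MBYTES_PER_PE = 64     # memory per PE, in MBytes
BYTES_PER_WORD = 8     # assumption: 64-bit words
FIELDS_PER_CELL = 6    # assumption: Ex, Ey, Ez, Hx, Hy, Hz

# Total memory in GWords (decimal giga, matching the figure in the text).
total_gwords = N_PE * MBYTES_PER_PE / BYTES_PER_WORD / 1000.0

grid_cells = 2.0e9
unknowns = grid_cells * FIELDS_PER_CELL

# Words left over for material arrays, boundary data, and so on.
headroom_gwords = total_gwords - unknowns / 1.0e9

print(total_gwords, unknowns, headroom_gwords)
```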

SUMMARY

Results  from  the  VOCAR  (variability  of  coastal  atmospheric  refractivity)  experiment,  performed  in  the  southern 
California  coastal  area,  are  presented  and  compared  with  a  terrain  parabolic  equation  model  called  TPEM.  Both
homogeneous  and  range  dependent  refractivity  environments  are  considered.

I.  INTRODUCTION 

Much emphasis has been given lately to radio field prediction in coastal environments. Currently, well established
propagation models exist that have been shown to predict, fairly accurately, radio signals over water [1, 2, 3]. However,
in a coastal environment, when one or both terminals are located a short distance inland, the smooth earth assumption
that these models employ fails to account for terrain effects on these propagation paths.

An experiment was performed recently along the southern California coast in which RF signals from military and civilian
Automatic Terminal Information Service (ATIS) transmitters were received at two locations on the southern California
coastline. Some of the propagation paths were entirely over water while others were partially over land. Many of the
inland transmitters were obstructed by very high cliffs and/or mountainous terrain typical of the California coastline.
Even when the propagation path was more than 90% over water, predictions based on a smooth earth assumption fail to
agree with measured data.

A  split-step  parabolic  equation  model  that  can  account  for  terrain  effects,  called  TPEM,  has  been  previously  described  [4,
5]  and  is  used  here  to  investigate  the  signals  that  were  measured  from  transmitters  located  several  kilometers  inland. 


Figure 1. Topographical map of Southern California coastal area.

2.  EXPERIMENT

In 1993, an experiment to characterize the variability of coastal atmospheric refractivity (VOCAR) was performed in the
southern California coastal area. Signals from 14 military and civilian transmitters located along the coastline between
San Diego and Santa Barbara were constantly being measured by receivers on the coast at the Naval Command, Control
and Ocean Surveillance Center, Research Development Test and Evaluation Division in San Diego (NRaD) and at the
Naval Air Warfare Center Weapons Division in Point Mugu (NAWCWPNS). Additional transmitters were placed on
San Clemente Island and were also received at San Diego and Point Mugu.


During an intense measurement period between August 23 and September 3, 1993, signals were constantly being received
over these land/water propagation paths. Observations taken over a few of the paths will be presented. The propagation
paths and the surrounding area are shown in Figure 1. The topography information shown in this figure, and the terrain
elevation information used by TPEM for all paths discussed in this paper, is from the Digital Terrain Elevation Data
(DTED) database provided by the Defense Mapping Agency. Both receivers at NRaD and at NAWCWPNS were located
at 30.5 m above mean sea level. Below is a table of the operating frequencies and antenna heights (above mean sea level)
of the transmitters located inland.


Location               Frequency (MHz)    Antenna Height (m)

Long Beach Airport     127.75             17.1
John Wayne Airport     126.0              24.4
San Clemente Island    268.6              66.7

Table 1. Location, frequency, and antenna heights of ATIS transmitters shown in Fig. 1.


Radiosondes were launched approximately 4 to 5 times daily
at North Island in San Diego, Point Mugu, San Clemente
Island, Camp Pendleton and Point Vicente.

3.  RESULTS 

The  first  path  considered  is  that  from  San  Clemente  Island  to 
San  Diego.  Signals  were  received  from  a  transmitter  located 
at  the  Naval  Auxiliary  Landing  Field  (NALF)  on  the  island
approximately  two  kilometers  inland.  A  small  mountain
peak,  roughly  137  m  high  and  located  between  NALF  and 
the  shore,  lay  in  direct  line  from  the  transmitter  to  the 
receiver  at  NRaD.  The  propagation  path  is  127  km  long  and 
although  most  of  the  path  is  over  water,  the  presence  of  this 
peak  created  a  substantial  reduction  in  signal  received  at  San 
Diego.


Figure 2. TPEM and RPO results vs. measurements for San Clemente Island to San Diego path using North Island
soundings (time axis: August-September 1993).


Radiosondes taken at North Island were used as refractivity inputs to RPO [3] and TPEM. RPO is a hybrid ray optics/PE
model that assumes smooth earth and does not account for terrain. The results, along with observations for the ten day
period in August and September, are shown in Figure 2. RPO underestimated the propagation loss by roughly 20 dB and,
in some instances, by as much as 40 dB, whereas TPEM showed very good agreement. The free space and troposcatter
thresholds are shown for reference. The troposcatter threshold was determined based on the model by Yeh [6]. Between
August 29 and September 3 signal levels were generally too low to be detected, and during this time period TPEM
predicted very low signal levels.
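
For orientation, a free-space reference level for this path can be estimated from the standard free-space loss expression, L = 20 log10(4 pi d f / c). The paper does not state how its free-space threshold was computed, so the sketch below is an assumption-laden illustration. It takes the 127 km path length quoted above and the 268.6 MHz San Clemente Island entry from the transmitter table, and it ignores antenna gains and all atmospheric and terrain effects.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def free_space_loss_db(distance_km, freq_mhz):
    """Free-space propagation loss between isotropic antennas, in dB."""
    d = distance_km * 1.0e3   # path length in metres
    f = freq_mhz * 1.0e6      # frequency in hertz
    return 20.0 * math.log10(4.0 * math.pi * d * f / C0)

loss_db = free_space_loss_db(127.0, 268.6)
print(f"{loss_db:.1f} dB")
```

Measured or predicted losses well above this level indicate how much the terrain and the atmosphere add to simple spreading loss.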


For  the  John  Wayne  Airport  to  San  Diego  propagation  path  there  is  a  substantial  mountain  range,  roughly  300  m  at  its
peak,  from  3  km  out  to  13  km  away  from  the  transmitting  antenna.  The  remainder  of  the  path  is  over  water  with  the
entire  path  being  123  km  long.  Refractivity  profiles  measured  throughout  the  ten  day  period  at  Camp  Pendleton  were 
used  as  environmental  inputs  to  RPO  and  TPEM.  A  comparison  of  their  predictions  against  measured  data  is  shown  in 
Figure  3. 

Figure 3. TPEM and RPO results vs. measurements for John Wayne to San Diego path using Camp Pendleton soundings.

Figure 4. TPEM and RPO results vs. measurements for John Wayne Airport to Point Mugu path using Point Mugu soundings.


A substantial reduction in signal level due to the mountain range is apparent. Again, RPO underestimates the losses by
approximately 10 to 20 dB while TPEM shows very good agreement. The diffraction threshold in this case was
determined based on a standard atmosphere environment over the same terrain path.


Refractivity profiles measured at Point Mugu were used for the propagation paths from John Wayne Airport to Point
Mugu and Long Beach to Point Mugu. The terrain path from John Wayne Airport to Point Mugu is roughly 129 km.
The path begins over the Los Angeles basin area and the remainder of the path is over water with some small
mountain peaks near the receiving end at Point Mugu. Figure 4 shows RPO and TPEM results, along with
observations, for this path. The terrain profile did not consist of very large obstructions and was relatively smooth,
with most terrain elevation features having a height of 50 m or less. One would expect that a smooth earth assumption in
this case would be adequate and that results from RPO would be in close agreement with observations. In fact, RPO and
TPEM results differ by only 5 to 10 dB. However, in comparison with measurements, TPEM still gives a much
better match to the data.

The path from Long Beach Airport to Point Mugu, a distance of 100 km, starts with fairly smooth terrain but
contains some high coastal mountain peaks near the receiving end at Point Mugu. Results are shown in Figure 5.
Here again, the presence of these peaks causes a significant reduction in received signal. TPEM shows very good
agreement with observations, while RPO differs by as much as 20 to 30 dB.


Figure 5. TPEM and RPO results vs. measurements for Long Beach to Point Mugu path using Point Mugu
soundings (time axis: August-September 1993).

Figure 6. Distribution for San Clemente Island to San Diego path for homogeneous and range dependent refractivities.

Figure 7. Distribution for John Wayne Airport to San Diego path for homogeneous and range dependent refractivities.

Clearly, for all of the propagation paths discussed in Figs. 2-5, the terrain had a significant effect on the field. The
smooth earth assumption results, given by RPO, did not adequately model what was observed. Also, since the signal
levels measured were usually well above troposcatter and diffraction levels, this indicates the refractivity had a major
effect on the field as well. The refractivity profiles used in Figs. 2-5 were taken from one site only and were either at the
receiving end of, or midway along, the path, and the atmospheric environment for these cases was assumed to be
homogeneous. Refractivity measurements were taken at more than one location along each path. The atmosphere can
change drastically at land/sea boundaries where anomalous conditions are attributable to very different mechanisms over
land than over water. How much improvement, if any, would there be between predicted and measured fields if range
dependent refractivity environments were considered?

Results from TPEM for the San Clemente Island to San Diego path for homogeneous and range dependent environments
are shown in Figure 6. Only measurements and predictions for the first six days were included in this distribution because
data over the last half of the measurement period were spurious or nonexistent. The measured data are shown
as a solid line and all information is given in terms of the percentage of time in which the propagation loss exceeds the
abscissa value. The North Island curve represents results from TPEM assuming homogeneous environments based on
soundings at North Island (this corresponds to Fig. 2). Similarly, the San Clemente Island curve represents results from
TPEM assuming homogeneous environments based on soundings measured at San Clemente Island. Lastly, soundings
from both of these locations taken at roughly the same time of day (no further than 30 minutes apart) were used for the
range dependent result. The results given by the San Clemente Island soundings produced the best agreement with
observations, particularly at the higher loss values. In fact, the range dependent curve gives the worst agreement overall.
This is a puzzling result and this case will be looked at in more detail in the next section.
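
The distribution curves described above can be illustrated with a minimal sketch. It assumes only what the text states, namely that each curve gives the percentage of time in which the propagation loss exceeds the abscissa value; the function name and the sample values are hypothetical and are not measured data.

```python
def exceedance_percent(losses_db, thresholds_db):
    """Percent of samples whose propagation loss exceeds each threshold (abscissa value)."""
    n = len(losses_db)
    if n == 0:
        raise ValueError("need at least one loss sample")
    return [100.0 * sum(1 for x in losses_db if x > t) / n for t in thresholds_db]


# Hypothetical loss samples in dB (illustrative only, not taken from the experiment).
samples = [120.0, 125.0, 130.0, 135.0, 140.0]
curve = exceedance_percent(samples, [118.0, 130.0, 141.0])
```

Applying the same function to the measured losses and to each set of predicted losses would give curves that can be compared at common abscissa values.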

For the path from John Wayne Airport to San Diego, results from homogeneous environments based on North Island and
Camp Pendleton soundings agreed with measurements just as well as those using the range dependent environments, as
shown by Figure 7. The range dependent environment for this path assumed horizontal homogeneity between John
Wayne Airport and Camp Pendleton based on Camp Pendleton soundings, and then range dependent refractivity between
Camp Pendleton and San Diego based on Camp Pendleton and North Island soundings.

Figure 8. Distribution for John Wayne Airport to Point Mugu path for homogeneous and range dependent
refractivities. (Abscissa: propagation loss in dB.)

Figure 9. Distribution for Long Beach to Point Mugu path for homogeneous and range dependent
refractivities. (Abscissa: propagation loss in dB.)

In Figure 8, results using range dependent environments gave slightly better agreement at the lower loss values for the
John Wayne Airport to Point Mugu path. The two curves representing homogeneous refractivity based on Point Vicente
and Point Mugu soundings performed equally well, though they do not match the measured data very closely at the lower
loss values. Here, the range dependent results assumed horizontal homogeneity from John Wayne Airport to Point
Vicente based on Point Vicente soundings, then range dependent refractivity from Point Vicente to Point Mugu based on
their respective soundings.

Figure 9 shows the distribution curves for the Long Beach to Point Mugu path. As in Fig. 8, the range dependent result
assumed horizontal homogeneity from Long Beach to Point Vicente, then treated the environment as range dependent
from Point Vicente to Point Mugu. The range dependent curve shows excellent agreement at the lower loss values.

4.  DISCUSSION 

In Figs. 6-9, the assumption of a homogeneous refractivity environment, using whatever soundings were available, gave
fairly good agreement with observations. For three of the four propagation paths, little difference is seen in the
distribution results when using a sounding midway or at one end of the path. A slight improvement occurred in only two
of the paths, Figs. 8 and 9, when using range dependent refractivity environments. The most puzzling result was in the
San Clemente Island to San Diego path (Fig. 6), where there was much better agreement with data when using soundings
from San Clemente Island than from North Island. Also, applying range dependent environments showed the worst
match. This leads one to assume that other mechanisms may be involved. Therefore, it is worth taking a second look at
this case.

On the San Clemente Island to San Diego path there were refractivity measurements available at both terminal locations
(although technically the radiosonde measurement site at North Island is located roughly 3 km from the receiver site, for
most practical purposes the refractivity measured can be considered representative of that at the receiving antenna).
Also, this is the only path in which one of the terminal antennas was located away from the mainland. It was found that,
over the ocean, the assumption of a horizontally stratified troposphere led to valid propagation assessments 86% of the
time [7]. While the troposphere over the sea does exhibit horizontal homogeneity over relatively long distances in most
cases, meteorological conditions occur occasionally in which the environment may change drastically in just a few
kilometers, such as at air-mass boundaries associated with land/ocean interfaces. To test this theory, the predictions
based on the range dependent environments will be repeated; however, the environments will be "weighted" in such a
way as to make the refractivity homogeneous from San Clemente Island to 2 kilometers off-shore of San Diego (or from
the receiving antenna). From 2 kilometers off-shore to the receiving antenna the refractivity then becomes range
dependent based on the San Clemente Island and North Island radiosonde measurements. The result is shown in Figure 10.
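
One way to picture the "weighted" environment is the sketch below. It is only an illustration under stated assumptions: the text does not say how the two soundings are combined inside the last 2 kilometers, so a linear blend is assumed here, the homogeneous part is assumed to use the transmitter-end sounding, and the function names, path length, and profile values are all hypothetical.

```python
def blend_weight(range_km, path_km, transition_km=2.0):
    """Weight given to the receiver-end sounding at a given range from the transmitter.

    Zero (homogeneous, transmitter-end sounding only) until transition_km before
    the receiver, then rising linearly to one at the receiver (assumed blend).
    """
    start = path_km - transition_km
    if range_km <= start:
        return 0.0
    if range_km >= path_km:
        return 1.0
    return (range_km - start) / transition_km


def profile_at(range_km, path_km, tx_profile, rx_profile, transition_km=2.0):
    """Refractivity profile (same height grid at both ends) at a given range."""
    w = blend_weight(range_km, path_km, transition_km)
    return [(1.0 - w) * a + w * b for a, b in zip(tx_profile, rx_profile)]


# Hypothetical 100 km path and two-level profiles, for illustration only.
mid_path = profile_at(50.0, 100.0, [300.0, 310.0], [320.0, 330.0])
near_shore = profile_at(99.0, 100.0, [300.0, 310.0], [320.0, 330.0])
```

Setting transition_km equal to the full path length would recover an environment that varies along the whole path, which is the case that gave the worst agreement for this path.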

Clearly, the predictions given by the "weighted" range
dependent environment now show much better agreement with
observations than the predictions given by the two
homogeneous environments.

5.  CONCLUSIONS 

It has been shown that in a coastal environment, fields are greatly affected by the presence of terrain even when
propagation paths are primarily over water. Another major effect on the field is, of course, the refractivity environment.

Horizontally homogeneous refractivity environments based on soundings taken at midway and at either end of the path
showed good agreement with measured data. Attempting to describe the atmosphere as accurately as possible by modeling a
range dependent environment gave only slight improvement of comparisons with observations. An important result arose from
one propagation path in which one of the terminal antennas was located far away from the mainland. In this case, a range
dependent environment based equally on the refractivity
measured at both terminal antenna locations did not produce better agreement with data, but in fact showed the worst
agreement overall. This was most likely due to the horizontally stratified refractivity in the surrounding ocean
environment, which changes abruptly very near large land masses. Therefore, more thought has to be given when
applying a range dependent environment in coastal areas.
Abstract

The Geometrical Theory of Diffraction has been used to model propagation
path loss in the presence of irregular terrain over a wide range of frequencies,
path lengths, and terrain types. The objective of this paper is to describe
the fundamentals of this modeling approach and to reveal its known
capabilities and limitations in light of recent work. Some comparisons of
modeled and measured data are given to demonstrate the type of accuracy
that can be achieved using the technique.


Introduction  and  Background 

Since the late 1970’s, the Geometrical Theory of Diffraction (GTD) has been used to model the effects
of irregular terrain on radiowave propagation for a variety of applications and frequency ranges. For
certain applications, GTD has provided results in very close agreement with measured data, while for
other applications it does not produce close agreement. The primary purpose of this paper is to provide
insights into the types of conditions where a GTD propagation model would be appropriate to use, and
the degree of accuracy that can be expected. A brief history of the technique, along with pertinent
references, is given below, followed by an overview of how the technique is implemented in a
contemporary model. Finally, some validation data are shown to demonstrate the types of accuracy that are
achievable.

The earliest known use of GTD for terrain-effect modeling involved the Instrument Landing System
(ILS) glide slope [1]. One of the reasons that GTD worked well for this application is that for the
geometry and frequency of interest (the receiving antenna at a 3° elevation angle for a 300 MHz signal
and horizontal polarization), terrain without tree cover appears to be a good conductor, and terrain
features are large with respect to a wavelength. Subsequent work with GTD proved it capable of
providing good results for the ILS localizer (100 MHz) as well.

In the early 1980’s [2], GTD was implemented in a general-purpose, point-to-point propagation model.
This model was named GELTI, which stands for GTD Estimated Loss due to Terrain Interaction.
Validation of that model with respect to measured data shows it to be very accurate for paths where the
number of rays in the model is sufficient to account for the significant propagation mechanisms, and for
paths where the troposphere can be assumed to be homogeneous. While it is recognized that other
GTD-based propagation models exist [3,4,5], the GELTI model is used as an example here because of
its proven accuracy and because of the author’s familiarity with it.

The original GTD diffraction coefficients were developed assuming that the scatterer was smooth and
perfectly conducting. To model terrain interactions more realistically, those diffraction coefficients
were modified to account for finite conductivity and local surface roughness. Validation work with
GELTI, which implements the modified diffraction coefficients, has shown that they do provide
greater prediction accuracy in some cases [6]. At present, the
range of frequencies and distances over which GTD terrain-effect modeling remains accurate is not fully
known. However, close agreement between measured and modeled data has been observed from as
low as 8 MHz [7] to as high as 9 GHz. Accuracy has also been seen for propagation paths up to roughly
50 miles, although accuracy appears to diminish for longer distances, likely because other
propagation mechanisms become more dominant.

To generate estimates of path loss due to irregular terrain, all GTD models use a piecewise-linear
approximation to the actual terrain profile. Until fairly recently, this linearization process was performed
manually, which made model accuracy dependent upon the expertise of the user and made the modeling
process quite time consuming. However, an automated approach has been developed [8] that enables a
piecewise-linear profile to be extracted from raw terrain data, which greatly simplifies the modeling
process and makes model results user-independent. The automated terrain linearization approach is
described briefly below.

Most GTD terrain-effect models are 2-dimensional in that only the terrain directly between the
transmitting and receiving antennas is considered in estimating path loss. Experience has shown that for
most paths, this simplification does not represent a major deficiency in the modeling process, and is
considerably less computationally intensive than 3-dimensional approaches. Further, an extension of the
terrain linearization process can be implemented to identify multipath-causing regions on the
3-dimensional terrain that can be taken into account when calculating path loss.


Overview  of  the  GELTI  Model 


The GELTI model was a spin-off of earlier work modeling ILS performance in the presence of irregular
terrain. Having undergone extensive modifications since its initial implementation, GELTI currently
estimates propagation path loss by summing the contributions of the rays and ray combinations listed below:


1) direct
2) reflected
3) diffracted
4) doubly-reflected
5) reflected-diffracted
6) doubly-diffracted
7) diffracted-reflected
8) reflected-diffracted-reflected
9) reflected-reflected-diffracted
10) diffracted-reflected-reflected
11) diffracted-diffracted-reflected
12) diffracted-reflected-diffracted
13) reflected-diffracted-diffracted
14) reflected-reflected-reflected
15) diffracted-diffracted-diffracted
16) adjacent-edge diffracted

The contribution of the direct ray will be calculated if there is no blockage between the transmitting and
receiving antennas. The magnitude and phase of that contribution are determined by free space loss, and both
transmitting and receiving antennas are assumed to be isotropic. To model applications involving high-gain
antennas, good results have been obtained by adjusting model results by the antenna gain at the observation
angle of interest [9]. This approach to modeling high-gain antennas should be accurate for most propagation
paths, since departure angles from the transmitting antenna for the various rays tend to be nearly equal.
However, GELTI can readily be modified to apply an antenna pattern weighting to individual rays
corresponding to different departure angles if the application warranted it.
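
As a rough illustration of the direct-ray term, the sketch below evaluates the free-space contribution between two isotropic antennas as a complex number. It is a generic textbook form, not GELTI code; the function names and the example path are assumptions made for illustration.

```python
import cmath
import math

C = 299_792_458.0  # speed of light, m/s


def direct_ray(distance_m, freq_hz):
    """Complex free-space contribution between isotropic antennas.

    Magnitude is lambda / (4 pi d), the voltage ratio equivalent of free space
    loss, and the phase is the propagation delay -2 pi d / lambda.
    """
    lam = C / freq_hz
    amplitude = lam / (4.0 * math.pi * distance_m)
    phase = -2.0 * math.pi * distance_m / lam
    return amplitude * cmath.exp(1j * phase)


def loss_db(field):
    """Path loss in dB corresponding to a complex contribution."""
    return -20.0 * math.log10(abs(field))


# Hypothetical 1 km path at 300 MHz.
example_loss = loss_db(direct_ray(1000.0, 300e6))
```

Adjusting the result by the antenna gain at the observation angle, as described above, would amount to subtracting that gain in dB from the computed loss.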

Reflected ray(s) will exist (there may be more than one) if there are points on the terrain profile where the
angle of incidence is equal to the angle of reflection. The amplitude and phase of the reflected ray are
determined by the complex-valued reflection coefficient, computed using the angle of incidence, electrical
constants of the ground plane, and a roughness factor representing height variability in local terrain, such as
variability caused by vegetation, uneven ground, or waves if propagation is over water. This roughness factor
does not account for gross terrain variations, such as hills or ridges, since those effects are computed using
GTD. The roughness factor is used to modify the reflection coefficient to account for imperfect reflection
caused by local terrain roughness.
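
The sketch below shows one common way such a coefficient can be formed: the Fresnel reflection coefficient for horizontal polarization over a lossy ground, scaled by a Rayleigh-type roughness factor. The text does not give the exact roughness formulation used in GELTI, so the exponential factor, the function names, and the ground constants here are assumptions for illustration only.

```python
import cmath
import math

EPS0 = 8.854187817e-12  # permittivity of free space, F/m
C = 299_792_458.0       # speed of light, m/s


def reflection_h(grazing_rad, eps_r, sigma_s_per_m, freq_hz):
    """Smooth-surface Fresnel reflection coefficient, horizontal polarization."""
    eps_c = eps_r - 1j * sigma_s_per_m / (2.0 * math.pi * freq_hz * EPS0)
    root = cmath.sqrt(eps_c - math.cos(grazing_rad) ** 2)
    s = math.sin(grazing_rad)
    return (s - root) / (s + root)


def roughness_factor(sigma_h_m, grazing_rad, wavelength_m):
    """Assumed Rayleigh-type reduction factor; sigma_h_m is the RMS height deviation."""
    g = 2.0 * math.pi * sigma_h_m * math.sin(grazing_rad) / wavelength_m
    return math.exp(-2.0 * g * g)


def rough_reflection_h(grazing_rad, eps_r, sigma_s_per_m, freq_hz, sigma_h_m):
    """Reflection coefficient reduced for local terrain roughness."""
    lam = C / freq_hz
    smooth = reflection_h(grazing_rad, eps_r, sigma_s_per_m, freq_hz)
    return roughness_factor(sigma_h_m, grazing_rad, lam) * smooth


# Hypothetical ground constants (eps_r = 15, sigma = 0.005 S/m) at 300 MHz.
smooth = reflection_h(math.radians(1.0), 15.0, 0.005, 300e6)
rough = rough_reflection_h(math.radians(10.0), 15.0, 0.005, 300e6, 0.5)
```

At small grazing angles the smooth coefficient approaches -1, which is consistent with the earlier remark that bare terrain appears to be a good conductor for the ILS glide slope geometry.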

Using conventional GTD, the amount of diffracted energy re-radiated from an edge is determined by the
diffraction coefficient, which is a function of the wedge angle and the incident and diffracted ray geometries.
The original formulation of the GTD diffraction coefficient was performed by Keller [10] in 1962. That
formulation, however, exhibited singular behavior near the shadow boundary (where the direct ray contribution
is discontinuous) and the reflection boundary (where the reflected ray contribution is discontinuous), which
hindered its utility as a modeling tool. This problem was later resolved by Kouyoumjian and Pathak [11] in
their development of the Uniform Theory of Diffraction (UTD). As stated above, the terms GTD and UTD
are often used synonymously, and GELTI employs UTD diffraction coefficients.

Higher order rays, such as the reflected-diffracted ray, are combinations of the fundamental rays. Although
higher-order rays tend to be smaller in magnitude than the first-order rays (i.e., the direct, reflected, and
diffracted rays), comparisons with measured data show that these rays can be important to model accuracy.
Calculation of the magnitude and phase of the higher-order rays is accomplished by accounting for
cumulative losses and phase shifts due to free-space propagation, reflection, and diffraction.

GELTI computes the total field at the receiver by taking the complex sum of all possible ray combinations.
The contribution of each ray is determined explicitly within the code, enabling individual rays to be modified
for antenna pattern or vegetation effects [12].
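
The complex sum can be sketched in a few lines. The representation of a ray as an (amplitude, phase) pair and the function name are assumptions for illustration; the values are hypothetical.

```python
import cmath
import math


def total_field(rays):
    """Complex sum of ray contributions given as (amplitude, phase_rad) pairs."""
    return sum(amp * cmath.exp(1j * phase) for amp, phase in rays)


# Two hypothetical equal-amplitude rays: in phase they add, in antiphase they cancel.
in_phase = total_field([(1.0, 0.0), (1.0, 0.0)])
antiphase = total_field([(1.0, 0.0), (1.0, math.pi)])
```

Because each ray enters the sum as a separate term, scaling one term (for an antenna pattern or vegetation loss, say) leaves the others untouched.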

Earth curvature is presently accounted for by adjusting the terrain elevation, and hence the model will not
properly represent a stratified troposphere. However, the possibility of modifying GELTI to model such effects is being
investigated [13].
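
A common way to fold earth curvature into terrain elevations is to add the "earth bulge" to each profile point, as sketched below. The text does not state the exact adjustment used in GELTI, so this form, the effective earth radius factor k = 4/3, and the function names are assumptions.

```python
EARTH_RADIUS_M = 6_371_000.0  # mean earth radius, m


def earth_bulge_m(d1_m, d2_m, k=4.0 / 3.0):
    """Height of the earth bulge at distances d1 and d2 from the two path ends."""
    return d1_m * d2_m / (2.0 * k * EARTH_RADIUS_M)


def adjust_profile(ranges_m, heights_m, k=4.0 / 3.0):
    """Raise each terrain height by the earth bulge so rays can be drawn as straight lines."""
    start, end = ranges_m[0], ranges_m[-1]
    return [h + earth_bulge_m(r - start, end - r, k)
            for r, h in zip(ranges_m, heights_m)]


# Hypothetical flat 50 km profile sampled at both ends and the midpoint.
adjusted = adjust_profile([0.0, 25_000.0, 50_000.0], [0.0, 0.0, 0.0])
```

A single value of k describes only one uniform refractivity gradient, which is one way to see why an elevation adjustment of this kind cannot capture a stratified troposphere.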
In order to investigate all possible combinations of each ray type for a complex terrain profile, a large amount
of computation time is required. To reduce execution time, it is assumed that some ray types can be ignored
without significantly affecting the estimated signal strength. In GELTI, it is assumed that all back-scattering
rays can be ignored. Because of this, GELTI is a forward-scatter model, and cannot be used to estimate
backscatter from terrain or other obstacles without modification.

Automated  Terrain  Linearization 


A problem common to most terrain-sensitive propagation models is establishing the parameters defining the
terrain profile [14]. For GELTI, it is defining a piecewise-linear terrain profile to represent the actual terrain.
Experience has shown that identifying the appropriate linear profile is the most significant factor affecting
model accuracy. However, because of limited validation work with the model and the complexity of the
problem, the linearization process has remained somewhat of an art, requiring considerable insight on the part
of the user. Consequently, a significant effort has been dedicated to establishing and automating a
methodology for creating linear profiles from raw terrain data to be used as input to GELTI. The result of that effort
is the Automated Terrain Linearization Model (ATLM), which reads raw terrain data and generates a
linearized profile that can be used as input to the GELTI model. The development of the terrain linearization
process outlined below uses examples relating to the Microwave Landing System (MLS), which operates at
a frequency of around 5 GHz. However, the approach scales for frequency, and has been shown to work
well over at least the same range of frequencies for which GELTI has been validated.

Representing an actual terrain profile by straight line segments assumes that some of the information
contained in that profile can be ignored in the modeling process. Thus, the objective in linearizing a profile is to
assess which parts of the profile will or will not affect propagation, and then to approximate the parts that
will affect propagation by linear segments of appropriate slope and height. As a general rule, terrain will
affect propagation if it: 1) blocks a ray trajectory, 2) supports reflection, or 3) contains an edge that will
re-radiate significant diffracted signal energy. These criteria are dependent upon antenna-terrain geometry and
frequency, and hence the optimal linearized profile will be dependent upon those parameters as well.

Another goal in linearizing a profile is to keep the number of linear segments representing a terrain profile to
a minimum. Including too many edges in a linear profile degrades model accuracy for essentially three
reasons: 1) the model will not be capable of representing some ray trajectories with the sixteen available ray
types, 2) some high-frequency assumptions fundamental to GTD will be violated as edges become closer
together, and 3) computational errors resulting from the inclusion of multiple complex number calculations will
become significant.

The approach used here to determine what segments of the actual terrain affect propagation was inspired by
Fermat's principle for edge diffraction [15], where the stationary phase points on the terrain profile are used to
identify points of reflection or diffraction that need to be represented in the linearized profile. When dealing
with actual terrain, there are often multiple stationary phase points clustered together, which may be caused
by roughness in the terrain surface. To distinguish which of the stationary phase points are caused by
roughness, the linearizing algorithm looks at the nearest neighboring stationary phase points; if both are within an
empirically-determined phase value of the candidate point, that point is rejected. That phase value is the only
empirical variable in the modeling process, and experience has shown that a value of 3π/8 radians, which
corresponds to 3/4 Fresnel zones, produces good results. Further, results are relatively insensitive to variations in
that variable.

Once the stationary phase points have been identified, linear regression is used to determine the slope and
intercept of the actual terrain on either side of the stationary phase points. Linear regression is begun at the
stationary phase point, and continues until the phase exceeds the empirical value described above. Thus,
using a value of 3π/8 radians for that variable will mean that linear regression will take place over 1.5 Fresnel
zones to determine the slope and intercept of a reflection point on the terrain. The linear profile is built by
piecing together the slopes and intercepts determined by linear regression.
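
The two steps described above (locating stationary phase points, then regressing outward until a phase limit is exceeded) can be sketched as follows. This is an interpretation for illustration, not the ATLM code: the phase is taken as that of the transmitter-terrain-receiver path, stationary points are taken as local extrema of that phase, the rejection of clustered points is omitted, and all names and the flat test profile are hypothetical.

```python
import math


def path_phase(x, z, tx, rx, wavelength_m):
    """Phase (radians) of the path transmitter -> terrain point (x, z) -> receiver."""
    d = math.hypot(x - tx[0], z - tx[1]) + math.hypot(rx[0] - x, rx[1] - z)
    return 2.0 * math.pi * d / wavelength_m


def stationary_indices(phases):
    """Interior samples where the sampled phase has a local extremum."""
    return [i for i in range(1, len(phases) - 1)
            if (phases[i] - phases[i - 1]) * (phases[i + 1] - phases[i]) < 0.0]


def fit_line(xs, zs):
    """Least-squares slope and intercept; needs at least two distinct x values."""
    n = len(xs)
    mx, mz = sum(xs) / n, sum(zs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (z - mz) for x, z in zip(xs, zs)) / sxx
    return slope, mz - slope * mx


def regress_from(i, step, xs, zs, phases, limit=3.0 * math.pi / 8.0):
    """Fit the terrain on one side of stationary point i (step is +1 or -1)."""
    j = i
    while 0 <= j + step < len(xs) and abs(phases[j + step] - phases[i]) <= limit:
        j += step
    lo, hi = min(i, j), max(i, j)
    if hi == lo:
        return None  # no neighbouring sample lies within the phase limit
    return fit_line(xs[lo:hi + 1], zs[lo:hi + 1])


# Hypothetical flat profile, 10 m antennas 1 km apart, 5 GHz (wavelength 0.06 m).
xs = [10.0 * k for k in range(101)]
zs = [0.0] * 101
tx, rx = (0.0, 10.0), (1000.0, 10.0)
phases = [path_phase(x, z, tx, rx, 0.06) for x, z in zip(xs, zs)]
points = stationary_indices(phases)
right_fit = regress_from(points[0], +1, xs, zs, phases)
```

For this flat profile the only stationary point is the midpoint, which is the specular reflection point, and the fitted segment has zero slope and intercept, as expected.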

Model  Validation 

As stated, early work with GTD terrain-effect modeling involved the ILS, and validation work for that
application was oriented towards predicting navigation system performance, rather than predicting absolute
signal strength. The first major validation study for the GELTI propagation model was funded by ECAC
[16], in which model results were compared against measured data collected by the Institute for
Telecommunication Sciences. A conclusion reached in that study was that the GELTI model was generally more
accurate than other contemporary models for paths less than 50 km in length. However, no attempt was made
to define an RMS error for the model due to uncertainty about the terrain linearization process, which was
performed manually for the study, as well as uncertainty about the quality of the comparison data (the
measured signal strength data and the terrain profile data).

Follow-on  validation  and  improvements  were  performed  by  Luebbers  [17,18]  with  the  results  showing 
GELTI  to  be  a  viable  means  for  predicting  absolute  signal  strength  for  short-range  propagation  paths. 

In 1989, a study was undertaken to predict signal strength for the microwave landing system (MLS)
operating in the presence of humped runways [19]. One of the byproducts of this effort was the development of
the automated terrain linearization process. Further, validation was performed using carefully collected
terrain and measured signal strength data at five different sites. The results of that validation show that when
accurate terrain and antenna data are entered into GELTI, agreement between measured and modeled data
can be within several dB.

Figure 1 shows plots of both the terrain profile and one comparison of measured versus modeled data. The
measured data were collected using a van with a mast that would vary the receiving antenna over a range of
heights from 3 to 40 feet above the ground; the signal strength plot of Figure 1 is for the receiver at the
location farthest from the transmitting antenna. As seen in the plot of the profile, part of the path is below line of
sight, and three diffractive edges are illuminated. As seen in the plot, the agreement between measured and
modeled data is excellent. The discontinuity in the modeled data at a receiver height of around 25’ is caused
by the fact that the model does not calculate the ray that is reflected after being adjacent-edge diffracted.

While the path shown is relatively short, 11,500’, it does represent over 58,000 wavelengths at 5 GHz.
Hence these data can be used to infer model performance on longer paths at lower frequencies, assuming that
other tropospheric propagation mechanisms, such as ducting, do not become significant.
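
The electrical length quoted above can be checked with a short calculation. The helper names are hypothetical, and the 300 MHz comparison frequency is chosen only as an example of the scaling argument.

```python
C = 299_792_458.0   # speed of light, m/s
FT_TO_M = 0.3048    # metres per foot


def path_in_wavelengths(length_ft, freq_hz):
    """Path length expressed in wavelengths at the given frequency."""
    return length_ft * FT_TO_M / (C / freq_hz)


def equivalent_length_km(n_wavelengths, freq_hz):
    """Physical length (km) spanned by n wavelengths at another frequency."""
    return n_wavelengths * (C / freq_hz) / 1000.0


n = path_in_wavelengths(11_500.0, 5e9)   # the Figure 1 path at 5 GHz
km_at_300mhz = equivalent_length_km(n, 300e6)
```

Under this scaling the 11,500’ path at 5 GHz has the same electrical length as a path of roughly 58 km at 300 MHz, provided the propagation mechanisms remain the same.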

Figure 1 Terrain Profile and Measured-Modeled Data Comparison for the Denver Stapleton Runway 09
MLS Azimuth Transmitter.

Figure  2  plots  both  the  profile  and  a  comparison  of  measured  and  modeled  signal  strength  for  another 
site.  The  comparison  data  shown  in  the  plot  were  collected  at  the  location  marked  “Threshold 
RWY09”,  and  the  transmitting  antenna  was  at  the  lower  of  the  heights  shown.  Again,  good  agreement 
between  measured  and  modeled  data  is  evident. 


Conclusions 

Validation data provide strong evidence that GTD can be used to accurately predict signal behavior in
the presence of irregular terrain over a wide range of frequencies. GTD models are theoretically
correct, in that no empirical assumptions are made about signal behavior. Consequently,
GTD models are realistically responsive to frequency variations, which enables them to provide
broadband propagation path information. Further, because ray trajectories are calculated, signals can be
reconstructed in the time domain, a capability not offered by other modeling techniques.

The major limitations are that GTD models do not, at present, account for an inhomogeneous
troposphere, and that their results become inaccurate when the model implemented does not have a
sufficient number of rays or ray types to account for the significant propagation mechanisms for a particular
path. Experience has shown that these limitations cause the useful accurate range for GTD
terrain-effect modeling to be around 50 miles, depending upon path specifics.

Figure 2 Terrain Profile and Measured-Modeled Data Comparison for Wilmington, Delaware Airport
Runway 09 MLS Azimuth Antenna

To  get  accurate  long-range  prediction  capability,  it  has  been  suggested  that  a  GTD  propagation  model 
be  coupled  with  a  parabolic  wave  equation  model,  which  is  sensitive  to  an  inhomogeneous  troposphere. 
In  such  a  hybrid  model,  the  GTD  model  would  generate  initial-condition  signal  strength  values  for  a 
region  around  a  transmitting  antenna,  and  the  parabolic  wave  equation  model  would  use  those  initial 
conditions  in  determining  how  the  wave  propagates  through  the  troposphere.  This  would  likely  provide 
an  improvement  in  prediction  accuracy  because  of  GTD’s  greater  sensitivity  to  terrain  variations. 
Background

Many propagation programs have been used to model radio wave propagation over terrain.
Terrain is modelled by specifying elevations at various ranges from the transmitter. The
terrain can be thought of as a series of half planes (or knife edges) or as a series of wedges
generated by connecting the terrain elevation points. Various knife edge diffraction methods, the
uniform geometrical theory of diffraction, Fresnel-Kirchhoff theory, and the parabolic equation
approximation are applied in the propagation programs. Programs that apply diffraction
techniques must calculate rays that diffract off intervening terrain features. In addition, some
programs calculate rays that combine diffractions with reflections.

The  programs  tested  here  that  use  knife  edge  diffraction  reduce  the  number  of  knife  edges 
to  those  which  are  most  significant.  When  multiple  half  planes  interact  with  the  wave  along  the 
path,  the  interaction  is  approximated  using  various  strategies  combining  a  few  single  half  plane 
diffractions. 
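
For orientation, the loss from a single half plane can be sketched with the widely used closed-form approximation given in ITU-R Recommendation P.526. This is a generic illustration only; it is not claimed that any of the programs compared here uses this particular expression, and the function names and example geometry are hypothetical.

```python
import math


def fresnel_nu(h_m, d1_m, d2_m, wavelength_m):
    """Dimensionless diffraction parameter for an edge of height h above the
    direct line, at distances d1 and d2 from the two terminals."""
    return h_m * math.sqrt(2.0 * (d1_m + d2_m) / (wavelength_m * d1_m * d2_m))


def knife_edge_loss_db(nu):
    """Approximate single knife-edge diffraction loss in dB (valid for nu > -0.78)."""
    if nu <= -0.78:
        return 0.0  # edge well below the path: loss treated as negligible
    return 6.9 + 20.0 * math.log10(math.sqrt((nu - 0.1) ** 2 + 1.0) + nu - 0.1)


# Hypothetical edge 10 m above the direct line, midway on a 2 km path, 1 m wavelength.
nu = fresnel_nu(10.0, 1000.0, 1000.0, 1.0)
loss = knife_edge_loss_db(nu)
grazing_loss = knife_edge_loss_db(0.0)  # edge exactly on the line of sight
```

The grazing case gives a loss of about 6 dB, the familiar result for an edge that just touches the direct path; multiple-edge strategies combine several such single-edge terms.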

For programs modelling the terrain as wedges, phase map techniques are applied to determine
the significant wedges, thus reducing the effort required to find the rays. When modelling
terrain as a series of wedges, geometric theory of diffraction can be applied to model the
interaction with each wedge. When multiple wedges interact with the ray, single wedge diffractions are
combined.

The two other techniques avoid the problems of diffraction. When using Fresnel-Kirchhoff
theory, the field is modelled by a series of Huygens sources that are stepped along the path.
At each step the field is calculated from the previous step using Fresnel-Kirchhoff theory while
accounting for reflection.

The second method is to step the field along the path and approximate the scalar wave
equation with a parabolic equation. Either the Fourier split step method or the finite difference
technique is applied to solve the parabolic equation. Since the parabolic equation method was
originally developed to find the field passing through atmospheric irregularities, a change of
variables transforms the terrain variations into atmospheric refractive index variations.

The purpose of the effort is to develop a standard set of terrain features to test using the
programs and compare the programs using the standard set of terrain features. Graphics presented
show the terrain features and the comparisons of the programs.

Models 

The following programs are compared: GTD Estimated Loss due to Terrain Interaction
(GELTI) developed by R. Luebbers and K. Chamberlin [1], the Communications Research Centre
(CRC) program developed by J. Whitteker [2], Smooth Earth Knife Edge (SEKE) as
described by S. Ayasli [3], Terrain Integrated Rough Earth Model (TIREM) developed by the
Electromagnetic Compatibility Analysis Center [4], Institute for Telecommunication Sciences
Irregular Terrain Model (ITM) [5], and Variable Terrain Radio Parabolic Equation (VTRPE)
developed by F. Ryan [6].

GELTI searches for up to sixteen different ray types. A ray type consists of reflection and
diffraction paths off terrain features. As many as three ground reflections, three diffractions, or
combinations of diffractions and reflections can form a ray type. Geometric Theory of Diffraction,
GTD, is used to calculate the diffraction losses [7]. The terrain profiles used by GELTI are
preprocessed by the Automated Terrain Linearization Model (ATLM). ATLM locates the significant
terrain features for a given radio link, and reduces the terrain profile to a simpler version
that preserves the significant features.

The CRC model advances the field as an array of Huygens' principle sources along the
path. The terrain is modelled by a series of knife edges and at each knife edge Fresnel-Kirchhoff
theory is used to generate the field from the previous knife edge. The reflected wave is included
by connecting each knife edge with a reflecting plane.

SEKE, TIREM, and ITM identify the terrain obstacles along the path and model them
using knife edges. The diffracted field from multiple knife edges is found using the Epstein-Peterson
method in TIREM and ITM and with a modified Deygout method in SEKE. SEKE also
accounts for the reflected wave.
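
As a rough illustration of how single knife-edge losses are combined in an Epstein-Peterson style construction, the sketch below sums one knife-edge loss per obstacle, with each edge's neighbours acting as its source and receiver. It is not the code of TIREM, ITM, or SEKE: the function names, the flat-earth geometry, and the closed-form loss fit (the widely used approximation of the kind given in ITU-R P.526) are assumptions made for illustration.

```python
import math

def fresnel_v(h, d1, d2, wavelength):
    """Fresnel-Kirchhoff parameter for an edge of height h (m) above the
    straight line joining its two neighbours, at distances d1 and d2 (m)."""
    return h * math.sqrt(2.0 * (d1 + d2) / (wavelength * d1 * d2))

def knife_edge_loss_db(v):
    """Approximate single knife-edge loss in dB. The fit is intended for
    v greater than about -0.78; below that the loss is taken as zero."""
    if v <= -0.78:
        return 0.0
    return 6.9 + 20.0 * math.log10(math.sqrt((v - 0.1) ** 2 + 1.0) + v - 0.1)

def epstein_peterson_loss_db(points, wavelength):
    """points: [(distance_m, height_m), ...] ordered as transmitter,
    edge 1, ..., edge N, receiver. Flat earth is assumed (hypothetical
    simplification; earth curvature and refraction are ignored)."""
    total = 0.0
    for i in range(1, len(points) - 1):
        (xa, ha), (xe, he), (xb, hb) = points[i - 1], points[i], points[i + 1]
        d1, d2 = xe - xa, xb - xe
        # Height of the straight line between the neighbours at the edge.
        line_h = ha + (hb - ha) * d1 / (d1 + d2)
        total += knife_edge_loss_db(fresnel_v(he - line_h, d1, d2, wavelength))
    return total

# Hypothetical 10 km path at about 400 MHz (wavelength 0.75 m), two ridges.
profile = [(0.0, 10.0), (3000.0, 60.0), (7000.0, 55.0), (10000.0, 10.0)]
loss = epstein_peterson_loss_db(profile, 0.75)
```

A Deygout-style construction differs in that it first finds the dominant edge over the whole path and then treats the remaining edges on the sub-paths on either side of it.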

VTRPE uses the parabolic equation approximation to calculate the field. Because the parabolic
equation method does not treat terrain directly, VTRPE linearly interpolates between
terrain points and transforms the terrain variations into atmospheric irregularities. VTRPE steps
the field in range, dynamically adjusting the range step to minimize the error.

Conclusions 

By developing a standard set of terrain features, propagation program results can be easily
compared with expected phenomena and each other. In summary, a diversity of propagation
modeling programs show both similarities and differences in their predictions. Generally, the more
similar the assumptions that different programs make, the more likely the losses and loss contours
are to agree. The differences between results indicate that more interesting work can be accomplished by
examining the differences induced by the terrain features and classifying terrain features accordingly.
Continued work will lead to better predictions by all propagation prediction programs.

The  data  sets  used  for  the  comparisons  are  available  to  propagation  prediction  program 
developers.  Programs  to  view  data  are  also  available. 
Abstract


This paper describes an elementary foliage propagation path loss model which represents
a forest as a dissipative dielectric slab lying in a more lossy half-space
represented by the ground. This physical concept was originally proposed by D.J.
Pounds and A.H. LaGrone and further investigated by Theodor Tamir. Our Electromagnetic
Wave Attenuation in a Forest (EWAF) model uses empirical foliage path
loss information as a basis and therefore is closely tied to actual path losses
encountered in real-world situations.


The  EWAF  model  includes  several  features:  the  effects  of  antennas  within  the  forest, 
outside the forest, and above the forest; wave polarization; forest density; canopy,
trunk, and undergrowth losses; antenna beamwidth; wet foliage; lush foliage; and
many  other  physical/forest  conditions.  It  is  implemented  on  a  Sun/SPARC  network 
and  provides  a  good  estimator  for  foliage  losses  anywhere  in  the  world. 


A.  Introduction 

For over a half-century, communicators have been concerned with radio frequency (RF)
path loss through foliage. Over the past 30 years, measurements and analyses have shown that actual
losses encountered within forests are considerably less than had been estimated for propagation paths
straight through foliage. Pounds and LaGrone hypothesized in 1963 that propagation loss through
foliage could be represented by propagation through a dielectric slab, where the antennas are submerged
in the dielectric media.

Tamir expanded this concept by developing the mathematics describing the dielectric slab propagation
phenomenon. The literature contains much measured data for propagation losses
around and in the vicinity of foliage. Although mathematical models have been prepared that
describe this propagation, very little has been done to provide communicators with a computerized
tool for predicting foliage losses.
This paper describes a first step in the development of the EWAF computer model, which uses
plausible  inputs  to  estimate  foliage  losses.  The  model  can  be  used  by  communications  engineers  to 
obtain  an  estimate  of  propagation  losses  through  a  forest  or  other  foliage  situations  expected  for  a 
given  set  of  operating  conditions.  The  estimates  will  be  useful  for  communications  planners  in 
situations  where  transmitter  and  receiver  antennas  (or  perhaps  only  one)  will  be  immersed  in  foliage. 
Representing  the  foliage  by  a  dielectric  slab  will  provide  an  estimate  of  foliage  losses  which  could 
be  expected  in  situations  where  meager  information  is  available  regarding  the  wood  density. 

The  information  available  on  propagation  loss  through  foliage  sometimes  refers  to  the  area  in  square 
feet  of  wood  per  acre  of  forest  and  assumes  a  uniform  density  throughout  the  forested  region.  In 
most  instances,  it  is  not  possible  to  measure  the  wood  density  per  acre.  Furthermore,  for  large 
forested  areas,  the  forest  density  is  quite  variable. 

The  EWAF  model  is  not  a  rigorous  mathematical  treatment  of  a  forest  represented  by  a  dielectric 
slab.  It  is  actually  based  upon  previously  measured  data  for  sparse  and  dense  forests  in  the  United 
States as well as very dense forests in India, as described by Tewari et al., where the average annual
rainfall ranges up to 3000 millimeters. The model was implemented using the Ada programming language
and  is  part  of  a  larger  graphic  user  interface  interactive  engineering  tool,  the  Analysis  Software 
Environment.  The  model  is  resident  on  a  Sun  network  at  the  Electronic  Proving  Ground,  Fort 
Huachuca,  Arizona,  of  the  White  Sands  Missile  Range. 

B.  Propagation  of  Radio  Waves  Through  a  Dielectric  Slab 

Radio (or electromagnetic) waves passing through foliage represented by a lossy dielectric slab
commonly incur path loss rates of perhaps 0.25 decibels per meter (dB/m) in excess of free space
[referred to as excess path loss (EPL)]. For path lengths of 1 kilometer, for example, the attenuation
of a wave due to foliage would be 250 dB. Communication over radio links having such losses
would usually be impossible or very difficult. However, experimental results indicate that the EPL
due to foliage is actually much less, more like 40 or 50 dB at 400 megahertz (MHz).

These  differences  between  expected  and  actual  results  led  Pounds,  LaGrone,  and  Tamir  to  the 
dielectric  slab  theory,  where  the  wave  travels  up  through  the  medium  (incurring  high  losses)  and 
then  travels  laterally  along  the  boundary  region  at  a  much  lower  loss  rate.  Figure  1  depicts  the 
propagation  path  geometry.  This  propagation  in  the  boundary  region  is  called  the  lateral  wave  and 
allows  communication  over  much  greater  distances  than  would  be  possible  otherwise. 
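
The contrast between the direct path and the lateral-wave path can be put in rough numbers. The sketch below is only a toy budget, not the EWAF calculation: it uses the 0.25 dB/m rate and 1 km path quoted above, takes the lateral-wave rate as one-tenth of the foliage rate (the upper end of the range the paper quotes in Section C), and assumes hypothetical antenna depths of 10 m below the treetops with vertical up and down legs.

```python
def direct_path_epl_db(foliage_rate_db_per_m, path_length_m):
    """Excess path loss if the wave travelled straight through the foliage."""
    return foliage_rate_db_per_m * path_length_m

def lateral_path_epl_db(foliage_rate_db_per_m, path_length_m,
                        tx_depth_m, rx_depth_m, lateral_ratio=0.1):
    """Toy lateral-wave budget: full-rate loss on the legs up to and down
    from the treetops, reduced-rate loss along the forest-air boundary.
    Slant geometry of the up and down legs is ignored (assumption)."""
    legs = foliage_rate_db_per_m * (tx_depth_m + rx_depth_m)
    lateral = foliage_rate_db_per_m * lateral_ratio * path_length_m
    return legs + lateral

direct = direct_path_epl_db(0.25, 1000.0)                    # 250 dB
lateral = lateral_path_epl_db(0.25, 1000.0, 10.0, 10.0)      # about 30 dB
```

Even this crude budget lands within roughly 10 to 20 dB of the measured 40 to 50 dB figure, whereas the direct-path estimate is off by about 200 dB.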

When  the  RF  propagation  is  compared  to  optics  and  the  refraction  encountered  when  a  wave  passes 
from  a  more  dense  medium  to  a  less  dense  medium,  the  geometry  becomes  equivalent  to  Figure  2. 
(The  more  dense  medium  is  the  forest,  the  less  dense  medium  is  the  air.)  A  large  portion  of  the  wave 
escapes  through  the  boundary  into  space  for  high  angles  of  incidence  with  the  boundary  region.  As 
the  angle  of  incidence  decreases,  the  refracted  wave  at  the  surface  becomes  more  nearly  parallel  to 
the  boundary.  At  a  critical  angle  of  incidence,  the  wave  travels  parallel  to  the  air-forest  boundary 
and  forms  the  lateral  wave,  which  incurs  fairly  low  loss  and  permits  communication  over  extended 
distances.  Associated  with  the  lateral  wave  is  a  lossy  wave,  called  the  leakage  field,  that  travels  back 
down  into  the  dielectric  slab.  The  leakage  field  provides  the  route  down  to  the  receiving  antenna, 
which may be submerged in the forest at some distance from the transmitting antenna. The leakage
field is mostly attenuated by the forest and the underlying ground and, for practical purposes, does
not reflect back out of the forest. The receiving antenna may be within the forest, above the forest,
or beyond the forest. The model handles these cases when the distance from the forest is not more
than two or three forest heights (see para H). Below the critical angle of refraction, the wave is
reflected back into the lossy dielectric and is highly attenuated by the foliage and earth beneath.

Figure 1. Propagation path along a dielectric slab

Figure 2. Refraction, reflection, and lateral waves
and leakage field due to differences in medium density
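
The critical angle in this optical analogy follows from Snell's law. The sketch below is a minimal illustration that treats the forest as a lossless dielectric; the relative permittivity of 1.1 is a hypothetical value chosen for the example and does not come from the paper. Both conventions are returned because the discussion above measures the angle from the boundary, while optics texts usually measure it from the normal.

```python
import math

def critical_angles_deg(eps_r_forest, eps_r_air=1.0):
    """Return (angle from the normal, angle from the boundary) in degrees
    at which the refracted wave runs parallel to the forest-air boundary.
    Losses (complex permittivity) are ignored in this sketch."""
    if eps_r_forest <= eps_r_air:
        raise ValueError("forest must be the denser medium")
    from_normal = math.degrees(math.asin(math.sqrt(eps_r_air / eps_r_forest)))
    return from_normal, 90.0 - from_normal

from_normal, from_boundary = critical_angles_deg(1.1)  # hypothetical permittivity
```

Because the assumed permittivity is close to that of air, the critical ray leaves the antenna at a fairly shallow angle to the treetop boundary, which is why the up and down legs through the foliage stay short relative to the lateral run.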


C.  Establishing  Foliage  Loss  Rate 

The  relationship  between  the  conductivity  of  the  forest  slab  and  the  through-foliage  loss  for  various 
frequencies  is  represented  by  the  equation: 


$$L_b = -20 \log_{10} e^{-2\sqrt{\sigma f}} \qquad (1)$$

where

L_b = loss rate in dB/m (in the forest)

e = 2.71828

σ = conductivity in siemens

f = frequency in MHz.

This is the theoretical rate of attenuation of radio waves passing through a thin screen of trees as
derived from Tamir's work by F.A. Losee and further evaluated by Kivett and Diederichs. This
represents the propagation losses experienced by a direct wave through foliage and has close agreement
with empirically collected data. The family of curves for various conductivities representing
this loss is shown in Figure 3. The conductivities are directly related to the densities of the forested
region. The top curve represents a very dense forest and the bottom curve represents sparse forest
conditions. In the EWAF model, the engineer can select the forest density (represented by the
conductivity) of the area for which estimates are desired.


Figure 3. Frequency versus loss rate
for several conductivities


Data  are  not  available  for  all  effects  at  all  frequencies.  Most  effects  are  known  at  certain  frequencies; 
for example, recent rain on a certain forest at 437 MHz was known to cause a 3-dB increase in path
loss  over  a  known  path.  This  can  be  equated  to  a  specific  increase  in  the  conductivity  of  the  forested 
region.  Solving  Equation  1  for  conductivity  yields: 

$$\sigma = \frac{0.00331361\, L_b^{2}}{f} \qquad (2)$$

where

L_b = loss rate in dB/m
σ = conductivity in siemens
f = frequency in MHz.
Equations 1 and 2 can be used to calculate conductivity at a certain frequency from a known path
loss  under  particular  foliage  conditions  and,  from  this  value,  determine  the  path  loss  at  other 
frequencies  for  these  same  conditions.  The  EWAF  model  makes  extensive  use  of  this  technique  to 
evaluate  the  effect  of  the  density  of  the  undergrowth,  leaf  moisture  content,  rain-wet  foliage,  and 
other  conditions  of  the  forested  region  which  change  the  conductivity  of  the  dielectric  slab.  The 
lateral  wave  loss  rate,  typically  one-tenth  to  one-fifteenth  of  the  foliage  loss  rate,  is  similarly 
adjusted. 
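
The frequency-scaling technique can be sketched as follows. The printed equations are damaged in this copy, so the forms used here are an assumed reconstruction: Equation 1 is taken as L_b = -20 log10(e^(-2 sqrt(σ f))), which is consistent with the constant 0.00331361 that survives in Equation 2, and Equation 2 as σ = 0.00331361 L_b² / f. The function names and the example numbers are illustrative only.

```python
import math

def loss_rate_db_per_m(sigma, f_mhz):
    """Assumed form of Equation 1. Written as 20*log10(e)*2*sqrt(sigma*f),
    which is algebraically the same as -20*log10(exp(-2*sqrt(sigma*f)))."""
    return 20.0 * math.log10(math.e) * 2.0 * math.sqrt(sigma * f_mhz)

def conductivity(loss_db_per_m, f_mhz):
    """Assumed form of Equation 2: Equation 1 solved for conductivity."""
    return 0.00331361 * loss_db_per_m ** 2 / f_mhz

# Illustrative use: a loss rate known at one frequency fixes the slab
# conductivity, which then gives the loss rate at another frequency.
sigma = conductivity(0.25, 400.0)            # from 0.25 dB/m at 400 MHz
rate_100 = loss_rate_db_per_m(sigma, 100.0)  # estimate at 100 MHz
```

Under this form the loss rate scales with the square root of frequency, so quartering the frequency halves the estimated loss rate.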

While  Equation  1  fits  most  of  the  empirical  data,  some  empirical  data  may  show  greater  or  lesser 
changes with respect to frequency. Some are not hard data and may have been taken at a limited
number of frequencies (as few as two). Such a small sample may not be representative of the true
variation  of  foliage  loss  rate  with  change  in  frequency.  Often,  an  author  will  have  a  limited  amount 
of  data  and  will  draw  a  foliage  loss  curve  through  the  data  without  considering  the  normal  signal 
strength  variations.  Such  a  curve  cannot  be  satisfactorily  extrapolated  to  other  frequencies. 

D.  The  Effect  of  Wave  Polarization 

Most of the available empirical data show an increase in foliage loss for vertically polarized
electromagnetic waves. The International Radio Consultative Committee (CCIR) has issued a
recommendation which includes the added losses encountered when using vertical polarization. The
recommendation covers the propagation losses for "an approximate average for all types of
woodland." The CCIR data are plotted in Figure 4.


Figure 4. Specific attenuation of woodland (CCIR 1992)


The  CCIR  recommendation  regarding  polarization  was  incorporated  into  the  foliage  loss  model  for 
frequencies  below  1.3  gigahertz  (GHz).  The  EWAF  model  improves  somewhat  on  the  CCIR 
recommendation  by  allowing  for  differences  in  vegetation  through  changes  in  conductivity  of  the 
dielectric  forest  slab. 
E. Antenna Beamwidth Effects


The  beamwidths  of  the  transmitting  and  receiving  antennas  play  a  significant  role  in  the  additional 
losses  experienced  due  to  propagation  through  foliage.  This  is  particularly  true  for  microwave  links 
or tropospheric scattering links. When microwave or tropospheric scattering links are established
within a forest, planners must take into account that the maximum range will be considerably
reduced compared with operation outside a forest. The increased path attenuation calculation in the EWAF
model is based upon the antenna response characteristics off-boresight for narrow beamwidth
antennas.

F.  Foliage  Water  Content 

Most researchers who collect empirical data in the field conclude that the attenuation in a forested
region is primarily due to the amount of moisture on or within the vegetation. To quantify this
effect, most data collectors accompany the data with comments regarding the lushness of the vegetation
and the presence or absence of moisture on the foliage. The preparers of the EWAF model gave
special attention to these parameters. Within the model, the conductivity of the dielectric slab is
adjusted to reflect comments regarding moisture on the foliage, lushness of the canopy, and lushness
of the undergrowth to accurately reflect the effect of the estimated overall water content.

G.  Effect  of  Forest  Canopy,  Exposed  Tree  Trunks,  and  Undergrowth 

The density of the forest canopy, the volume of the wooded trunks, and the density of the undergrowth
all affect the propagation path losses within a forest. Forests with relatively light canopies
will transmit sunlight to the trunks and to the ground. The additional sunlight penetration into the
forest tends to increase the foliage on the lower parts of the trees and increase the undergrowth. For
example, Germany's Black Forest has a dense canopy, trunks that are not foliated, and no undergrowth.
The result is a dark cavern carpeted with pine needles fallen from the high canopy. Other
forests, like those in the panhandle of Florida, have such lush undergrowth that it is difficult to
simply walk through.

The EWAF model allows the engineer to specify the density of the canopy, the tree trunk exposure,
and the height and density of the undergrowth. The treatment of these layers within the dielectric
slab, while not rigorous, does allow some consideration of these factors. In the future, the EWAF
model will allow a rigorous treatment of these forest layers.

H.  Propagation  Outside  the  Forest 

The  EWAF  model  is  intended  primarily  to  provide  an  estimate  of  the  electromagnetic  propagation 
path  losses  through  a  forested  region  when  both  the  transmitting  and  receiving  antennas  lie  within 
the  forest.  Occasionally,  however,  either  one  or  the  other  of  the  antennas  is  above  the  forest  or 
beyond  the  forest.  Empirical  data  are  available  to  assist  in  the  estimation  of  path  losses  extending 
out  of  the  forest.  In  the  case  of  antennas  above  the  tops  of  the  trees,  data  are  available  relating  the 
slope  angle  of  the  path  from  the  top  of  the  trees  to  the  antenna  as  related  to  the  lateral  wave  loss 
rate.  These  slope  angles  are  very  small  in  most  cases  and  involve  propagation  in  the  lateral  wave 
mode  for  an  appreciable  portion  of  path. 
Transmission beyond the edge of the forest, on the other hand, requires a transition from the lateral-wave
mode to a free-space mode which resembles wave diffraction over an obstacle. Transmission
beyond  the  edge  of  the  forest  is  highly  dependent  upon  the  diffraction  angle  from  the  lateral  wave 
(parallel  to  the  ground)  to  the  receiving  antenna  near  the  ground.  Figure  5  shows  the  transition  from 
the  lateral-wave  mode  to  the  free-space  mode  at  a  diffraction  angle. 


I.  Model  Input  Variables 


The  model  accepts  a  wide  variety  of  input  parameters  which  describe  the  physical  conditions  under 
analysis.  These  parameters  allow  the  model  to  ascribe  physical  constraints  to  the  simulation  and 
provide  estimates  of  propagation  loss  through  foliage  commensurate  with  the  scenario.  These 
parameters  are: 


•  Carrier  Frequency 

•  Forest  Density 

•  Link  Length 

•  Type  of  Foliage  (conifer/deciduous) 

•  Height  of  Transmitting  Antenna 

•  Height  of  Receiving  Antenna 

•  Height  of  Exposed  Trunks 


•  Receiver  Distance  to  Forest  (or  in  forest) 

•  Transmitter  Distance  to  Forest  (or  in  forest) 

•  Presence  of  Moisture  on  Foliage 

•  Density of Undergrowth

•  Polarization  of  Electromagnetic  Wave 

•  Lushness  of  Forest 

•  Height  of  Undergrowth 


The term "forest density" is used in the EWAF model as the input to establish propagation parameters.
This parameter is stated in terms of visual observation of the forest as light, medium, dense,
very dense, jungle, and heavy jungle. These values are then equated to the conductivity of the forest.
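
One way to picture the input set is as a single record, sketched below. This is not the EWAF interface (the model was written in Ada); the class names, field names, units, and field types are all assumptions made for illustration. Only the list of parameters and the six forest density categories come from the text, and no conductivity values are assigned because the paper does not give them.

```python
from dataclasses import dataclass
from enum import Enum

class ForestDensity(Enum):
    """The six visual-observation categories named in the text."""
    LIGHT = "light"
    MEDIUM = "medium"
    DENSE = "dense"
    VERY_DENSE = "very dense"
    JUNGLE = "jungle"
    HEAVY_JUNGLE = "heavy jungle"

class FoliageType(Enum):
    CONIFER = "conifer"
    DECIDUOUS = "deciduous"

class Polarization(Enum):
    VERTICAL = "vertical"
    HORIZONTAL = "horizontal"

@dataclass
class EwafInputs:
    """Hypothetical container for the listed inputs (units are assumed)."""
    carrier_frequency_mhz: float
    forest_density: ForestDensity
    link_length_m: float
    foliage_type: FoliageType
    tx_antenna_height_m: float
    rx_antenna_height_m: float
    exposed_trunk_height_m: float
    rx_distance_to_forest_m: float   # 0.0 taken to mean "in forest"
    tx_distance_to_forest_m: float   # 0.0 taken to mean "in forest"
    moisture_on_foliage: bool
    undergrowth_density: str         # free-form descriptor (assumed)
    polarization: Polarization
    forest_lushness: str             # free-form descriptor (assumed)
    undergrowth_height_m: float

# Hypothetical scenario: both antennas inside a dense conifer forest.
scenario = EwafInputs(400.0, ForestDensity.DENSE, 1000.0, FoliageType.CONIFER,
                      3.0, 3.0, 5.0, 0.0, 0.0, False, "sparse",
                      Polarization.VERTICAL, "moderate", 1.0)
```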
J. Concluding Remarks

The EWAF model provides the design engineer or communicator with a tool which can be used to
estimate probable propagation path losses through vegetation. The model is not mathematically rigorous
but instead is based upon empirical data which produce results surprisingly similar to losses
which are experienced in the field. It requires input parameters which are available or can be
obtained without field measurement. Constants and factors are contained in tables that can easily be
changed to fine-tune the model as more empirical data are analyzed. The prediction of electromagnetic
wave propagation path losses through layers of the forest could be improved with additional
analyses.
Abstract. The Radio Physical Optics (RPO) model is a hybrid electromagnetic propagation model composed of ray-optics
and parabolic equation techniques to account for propagation effects over the ocean. The ray-optics sub-models provide
computer efficient solutions in their regions of applicability; the parabolic equation sub-model provides the solution in the
lower  elevation  angle  regions  and  accommodates  range-dependent  refractive  conditions.  The  model  has  been  fully 
documented  and  a  U.S.  patent  has  been  allowed  on  the  hybrid  techniques.  RPO  is  considered  applicable  to  the  nominal 
frequency  range  of  100  MHz  to  20  GHz  and  has  been  shown  to  be  more  computationally  efficient,  in  both  speed  and 
memory  usage,  than  conventional  split-step  parabolic  equation  models. 

The  scientific  validation  of  RPO  has  consisted  of  3  types  of  comparisons:  (1)  model  to  data,  (2)  model  to  model, 
and  (3)  internal  consistency.  Model  to  data  comparisons  have  been  made  for  normal  and  anomalous  propagation 
conditions,  including  homogeneous  and  range-varying  surface-based,  elevated,  and  evaporation  ducts.  Model  to  model 
comparisons  consist  of  comparing  RPO  outputs  with  the  outputs  of  other  high-fidelity  models  for  the  same  input  data. 
For  this  purpose,  diverse  propagation  modeling  techniques  like  waveguide  and  split-step  parabolic  equation  methods  were 
used  as  standards  of  comparison.  Model  consistency  comparisons  examined  the  ability  of  the  numerical  implementation 
of  RPO  to  reproduce  reciprocity  for  range-varying  refractive  conditions  in  accordance  with  the  Lorentz  Reciprocity 
Theorem.  This  paper  shows  example  results  for  the  three  types  of  validation  conducted  on  RPO. 


I. BACKGROUND

The  Radio  Physical  Optics  (RPO)  electromagnetic  propagation  model  is  a  hybrid  model 
composed  of  ray-optics  and  parabolic  equation  (PE)  techniques  developed  by  H.  V.  Hitney  [1]  to 
account  for  propagation  effects  over  the  ocean.  The  impetus  for  this  development  was  the  requirement 
in  the  Navy  for  a  propagation  assessment  model  that  would  extend  the  existing  capability  that  assumes 
horizontal  homogeneity  of  the  refractive  structure  to  accommodate  range-varying  refractive  structure. 
Split-step  PE  models  had  become  widely  accepted  for  this  purpose  in  the  lower  atmosphere  but  had 
the  disadvantage  of  requiring  increasing  computational  resources  with  increasing  frequency,  higher 
elevation  angles,  and  higher  altitudes  to  the  point  of  being  impractical  for  a  full  range  of  operational 
applications.  Ray-optics  techniques  are  very  computer  efficient  for  assessing  propagation  within  the 
radio  horizon  but  become  intractable  beyond  that  range.  The  concept  of  combining  these  two 
techniques  into  a  computer  efficient  model  is  simple,  but  the  implementation  of  such  a  model  was  a 
significant  achievement  and  U.S.  Patent  No.  5,301,127  was  awarded  for  the  hybrid  techniques. 

The  successful  development  of  an  entirely  new  model  then  raises  the  question  of  how  good  the 
model  is.  It  must  be  as  accurate  as  the  existing  models  and  it  must  correlate  with  radio/radar 
observations in order to be accepted. The scientific validation of RPO has consisted of 3 types of
comparisons: model to data, model to model, and internal consistency. Validation of RPO is the
combined  work  of  a  number  of  people  resulting  in  several  work-years  of  effort. 

It  is  also  necessary  to  distinguish  what  is  meant  by  validation.  By  scientific  validation,  I  mean 
the  measurement  of  the  physical  and  numerical  credibility  of  the  model.  In  terms  familiar  to  the 
software  engineer,  independent  verification  and  validation  (IV&V)  deals  more  with  the  numerical 
credibility  of  the  model  and  how  well  that  model  addresses  the  requirements  that  drove  its 
development.  A  final  term  I  also  define  is  tactical  validation  which  is  a  measure  of  the  physical 
credibility  of  an  assessment  system.  An  assessment  system  is  not  only  the  propagation  model  and  the 
models  that  characterize  the  refractive  environment,  but  also  the  system  performance  models  that 
characterize  detection,  intercept,  or  communications  criteria.  Tactical  validation  may  depend  upon 
the  application.  For  example,  assessments  of  radar  coverage  for  aircraft  penetration  of  air  defenses 
require  only  depicting  the  relative  detection  capability;  whereas  an  assessment  of  detection  ranges  for 
a  ship’s  search  radar  against  cruise  missiles  must  depict  absolute  detection  range  with  high  fidelity. 

II.  MODEL  TO  DATA  COMPARISONS 

One cannot absolutely validate a propagation model by comparison to radio data because the
propagation  model  is  dependent  upon  an  accurate  characterization  of  the  meteorological  conditions  and 
its  output  is  compared  to  radio  data  that  are  measured  by  systems  that  have  limits  in  their  accuracies. 
The  latter  can  be  quite  tightly  controlled,  but  obtaining  the  meteorological  data  on  a  sufficient 
temporal and spatial scale to fully characterize the propagation environment is not likely [2]. Thus,
there  will  always  be  differences  between  the  model  predictions  and  the  measured  data  and  the  tradeoff 
is  accuracy  versus  cost  [3]. 

A  convenient  measure  of  effectiveness  is  whether  or  not  the  model  provides  improvement  over 
either  free  space  or  standard  atmosphere  propagation.  Comparisons  of  RPO  predictions  have  been 
made  to  data  for  normal  and  anomalous  propagation  conditions  and  documented  in  technical  reports 
and refereed journal articles [4-9].

An example of model to data comparisons for a range-varying elevated trapping layer utilizes
radio and meteorological data collected by an aircraft flying a sawtooth pattern between San
Diego, CA and Guadalupe Island [5]. Figure 1 shows the refractivity profiles on one day that
is characterized by a surface-based duct in the vicinity of the transmitter that lifts downrange
to become an elevated duct. Figure 2 shows coverage for a 520 MHz transmitter located at
30.5 m overlooking the ocean under the measured refractive conditions. At ranges greater than
100 km, energy that was trapped in the duct near the transmitter tends to follow the rise of the
duct with increasing range. The RPO output at a range of 148 km for these refractive conditions
is compared to the measured data in Figure 3. The comparison between predicted signal, in terms
of propagation factor, and observed signal is considered good, particularly when compared
to the standard atmosphere case. The effects of ducting are readily apparent in comparison to the
predicted propagation factor for a standard atmosphere. (Barrios [5] obtained qualitatively
better comparisons for these data by using the predicted propagation factor along the aircraft
slant path; for simplicity, this was not done here.)

Figure 1. Modified refractivity profiles vs range from Point Loma in San Diego.

Figure 2. RPO predicted propagation loss for the environment characterized in Figure 1.

A second example [9] comes from an evaporation duct experiment in a littoral area in
the Aegean in which radio propagation measurements were made in the 1 to 40 GHz
frequency range. Figure 4 shows data at 9624 MHz (dots) compared to range dependent RPO
predictions (solid line) for low-sited antennas (transmitter at 4.5 m and receiver at 4.9 m).
Surface meteorological measurements on the two islands of Naxos and Mykonos, 35.2 km
apart, were used to characterize the evaporation duct structure (any influences from elevated
refractive structures are thus not accounted for). Signals varied nearly 77 dB, from above
free space to less than standard diffraction, over this 15 day period. Even though quite
crude meteorological data were used to depict range-varying refractive structure, RPO is
considered to follow the trends quite well and, with regard to absolute accuracy, is within 10
dB of the observed signal nearly 70% of the time. In particular, propagation loss
predictions for 19 through 22 November are improved using range-dependent refractive
structure to drive RPO [9].

Figure 3. Comparison between RPO and measured radio data at a nominal range of 148 km.
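
The two kinds of statement made in this section, improvement over a baseline such as free space or a standard atmosphere, and the fraction of time the prediction falls within a given tolerance of the observation, can be computed as sketched below. This is a generic illustration, not the procedure used in the cited reports: the function names are hypothetical and the sample values are made up solely to exercise the code.

```python
import math

def rms_error_db(predicted, observed):
    """Root-mean-square difference between two equal-length dB series."""
    squares = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squares) / len(squares))

def fraction_within_db(predicted, observed, tolerance_db=10.0):
    """Fraction of samples where prediction and observation differ by
    no more than tolerance_db."""
    hits = sum(1 for p, o in zip(predicted, observed)
               if abs(p - o) <= tolerance_db)
    return hits / len(observed)

def improves_on_baseline(model, baseline, observed):
    """True when the model's RMS error is below the baseline's."""
    return rms_error_db(model, observed) < rms_error_db(baseline, observed)

# Made-up propagation loss values in dB, for illustration only.
observed = [150.0, 160.0, 170.0, 180.0]
model = [152.0, 158.0, 175.0, 195.0]
baseline = [140.0, 140.0, 140.0, 140.0]
within = fraction_within_db(model, observed)          # 0.75
better = improves_on_baseline(model, baseline, observed)
```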

A  third  example  comes  from  [7]  and 
shows  a  comparison  of  RPO’s  troposcatter 
model  to  data.  The  troposcatter  model  is  a 
semiempirical  scatter  model  that  adds  a  random 
refractive-index  fluctuation  to  the  mean 
refractive-index  value  at  each  height  for  which 
the  parabolic  equation  submodel  computes  propagation  loss.  Without  a  troposcatter  model,  RPO 
would  calculate  diffraction  losses  that  could  far  exceed  what  is  observed.  Although  insignificant  for 
radar  applications,  excessive  losses  could  result  in  erroneous  assessments  of  communications  or 
electronic  warfare  intercept  capabilities.  Figure  5  shows  propagation  loss  versus  range  for  a  220  MHz 
transmitter  at  23.5  m  above  sea  level  as  derived 
from  measurements  by  an  aircraft  flying  at  an 
altitude  of  152  m  between  Scituate,  MA  and 
Sable  Island.  RPO  without  scatter 
overestimates  propagation  losses  by  up  to  50 
dB  at  ranges  beyond  about  150  km.  RPO  with 
scatter  predicts  propagation  losses  consistent 
with  the  measured  data.  Also  shown  in  Figure 
5  is  the  propagation  loss  predicted  by  EREPS 
(Engineer’s  Refractive  Effects  Prediction 
System), which is discussed further in the next
section.

Additional  comparisons  to  data, 
including  the  effects  of  elevated  trapping  layers 
and evaporation ducts at short ranges, are shown
in  references  [4-9]. 

III.  MODEL  TO  MODEL  COMPARISONS 

Discrepancies between data and
propagation model predictions can occur for
several reasons, including inadequacies in the
propagation model, the meteorological
characterizations, and radio-meteorological
measurement errors. Thus RPO was also
compared to other high-fidelity
propagation models, including waveguide and
PE models [1]. Figure 6 is an example of
propagation factor vs. height at a range of 185
km for a 3000 MHz transmitter at 30.5 m and
a homogeneous 465 m surface-based duct. The
RPO output is indistinguishable from
waveguide and PE model results. The
boundaries between the RPO submodels are
shown by the horizontal dashed lines: PE
(parabolic equation), XO (extended optics), and
RO (ray optics). The geometry is such that the
FE (flat earth) model region is higher than the
ordinate scale. For reference, the standard
atmosphere case is indicated by the dash-dot
curve.

Figure 4. Comparison of observed and RPO
predicted propagation loss vs time.

Figure 5. Comparison of observed and predicted
(RPO with scatter, RPO without scatter, and
EREPS) propagation loss vs range.

The RPO troposcatter model [7] was
compared to the empirical troposcatter model
currently used in EREPS. EREPS is a PC-based
set of software programs based on the
propagation models that are currently in
operational use in the U.S. Navy [10]. Figure
5 shows that RPO agrees with the EREPS
model; numerous other comparisons resulted in
differences between the two models of no more
than a few decibels. Differences between the
two models may occur for some frequencies in
the more poleward regions of the world's
oceans. The empirical EREPS troposcatter
model is a function of $N_s$, the surface refractivity,
which decreases poleward. RPO uses an
effective median structure parameter ($C_n^2$) profile
to generate small refractive index fluctuations that
are added to the refractive index value at each
vertical point in the PE submodel. The effective
median structure parameter profile is applied
worldwide. A more meteorologically rigorous method
would use observed $C_n^2$ profiles or profiles that
vary with air mass. However, neither method is
feasible in an operational model because of lack of data.

Figure 6. Propagation factor vs height at 185 km
downrange as predicted by RPO, a waveguide
model, and a PE model for a homogeneous
surface-based duct.


IV.  MODEL  CONSISTENCY 

The  numerical  implementation  of  RPO  was  tested  for  physical  consistency  in  parametric  studies 
that  showed  RPO  demonstrated  reciprocity  in  range-dependent  evaporation  ducting  and 
surface/elevated  ducting  environments  [11-12].  Reciprocity  requires  signal  levels  at  the  terminals  to 
be  equal  for  both  directions  of  propagation  no  matter  what  the  intervening  media.  Possible 
reciprocity-breaking  mechanisms  in  RPO  include  the  differing  split-step  PE  range  step  for  opposing 
directions  of  propagation  and  sub-model  boundary  crossings. 

Figure 7 shows coverage along a radial reciprocal to Figure 2 to a range of 230 km for a 520
MHz transmitter at 50 m above mean sea level. The coverage patterns between the two figures are
considerably different downrange. Figure 8 shows propagation loss vs range (extracted from Figure
2) at an altitude of 50 m for a 30.5 m transmitter at Point Loma radiating towards Guadalupe Isle.
Also shown is propagation loss vs range (extracted from Figure 7) at an altitude of 30.5 m for a
50 m transmitter on a reciprocal path. Propagation loss varied by up to 40 dB. Reciprocity is
demonstrated by the nearly equal propagation loss values at the 230 km terminal ranges.

Figure 7. RPO predicted propagation loss on a reciprocal path to that of Figure 2.

V.  CONCLUSIONS 
RPO has been scientifically validated by comparison to experimental data and other
high-fidelity models. In addition, the numerical implementation has been tested for internal
consistency. RPO has been shown to be far faster than pure PE techniques for typical radar
coverage assessments required by the U.S. Navy (e.g., 3000 MHz, 50,000 feet in altitude, and
300 nmi in range).

Figure 8. Propagation loss vs range at 50 m and 30.5 m altitude for a 30.5 m transmitter and
a 50 m transmitter respectively.

§1.0 Introduction

In practice, the actual performance of a communication or radar system operating in the VHF-EHF band
(30 MHz-100 GHz) and having propagation paths beyond the line of sight is often quite different from the
characteristics predicted based upon free-space propagation physics. The free-space detection ranges are often
several orders of magnitude different from those observed in the atmosphere. Some of the reasons for this
discrepancy are: 1) The earth's surface is irregular and a finite conductor, scattering and reflecting incident
energy in various directions. This leads to depolarization of the scattered field and creates complicated spatial
interference patterns due to multipathing. 2) The curved earth casts a shadow, creating a “hole” for low altitude
transhorizon propagation, and irregular surface terrain gives rise to diffraction phenomena. 3) Spatial and
temporal inhomogeneities in the atmospheric index of refraction cause significant ducting and refraction of radio
wave energy. 4) Rough surface attenuation and volume absorption mechanisms attenuate the wave fields. These
“anomalous” propagation effects are usually associated with all practical or real-world electromagnetic problems.
To adequately address anomalous propagation, a model must incorporate full-wave propagation physics and
range-dependent environmental inputs.

Various methods have been adopted to deal with anomalous propagation and they may loosely be grouped into
three categories: empirical, analytical and numerical. Empirical techniques rely on direct measurements, which
limits their region of extrapolation. Analytical methods for solving the wave equation, such as separation of
variables or the geometrical theory of diffraction, while often exact, suffer from being restricted to certain
geometries or boundary conditions. Numerical methods, which attempt solution of the exact Maxwell field equations
or equations derived from them, are particularly useful for solving complicated three-dimensional problems using a
computer. The down side of numerical methods is the often unacceptable computational burden associated with
them, particularly for full elliptic wave solvers. However, for tropospheric propagation, it is often the case that
the electromagnetic energy flux is dominated by forward angle multiple scattering physics, in which case the
elliptic wave equations for the electromagnetic fields may be approximated by simpler parabolic wave equations.

In 1946 Leontovich and Fock¹ applied the parabolic wave equation (PE) to the problem of transhorizon radio
wave propagation above a spherical earth, thereby making a breakthrough in electromagnetic wave propagation
modeling. Approximately 30 years passed before a practical algorithm for solving the Leontovich-Fock parabolic
wave equation was developed. In 1973, Hardin and Tappert² developed the split-step Fourier PE (SSFPE)
algorithm. The SSFPE algorithm exploited advances in computer hardware and the development of the fast
Fourier transform (FFT) algorithm to yield an efficient numerical solution to the Leontovich-Fock parabolic wave
equation.³ In the 1980s, the SSFPE method was applied to tropospheric radar propagation by several researchers.⁴⁻⁷
These treatments modeled the earth's surface as smooth.

For propagation over irregular surfaces, an integral equation method had been developed by Hufford⁸ and later
extended by Ott.⁹ In 1991 the SSFPE algorithm was extended to include irregular, inhomogeneous terrain by
Ryan¹⁰ and forms the basis of the VTRPE (Variable Terrain Radio Parabolic Equation) model. The remainder of
the paper reviews the electromagnetic SSFPE algorithm and covers the propagation physics in the VTRPE model.

§2.0 Parabolic Wave Equation


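A quick review will now be given of the variable terrain electromagnetic parabolic wave equation derivation.
Further details are available in Ryan.¹⁰ For monochromatic radiation (implicit time-dependence $\exp(-i\omega t)$, with
$\omega$ the radian frequency) the electric E and magnetic H radiation fields are solutions to second order vector
equations obtained from Maxwell's equations. In rationalized mks units, the magnetic field vector H solves

$$\nabla^2 \mathbf{H}(\mathbf{r}) + \frac{\nabla\varepsilon(\mathbf{r})}{\varepsilon(\mathbf{r})} \times \nabla \times \mathbf{H}(\mathbf{r}) + k_0^2\,\varepsilon(\mathbf{r})\,\mathbf{H}(\mathbf{r}) = 0, \qquad (1)$$

with $k_0 = \omega/c$ the vacuum wave number, while the electric field vector E(r) is a solution of:

$$\nabla^2 \mathbf{E}(\mathbf{r}) + \nabla\left[\mathbf{E}(\mathbf{r}) \cdot \frac{\nabla\varepsilon(\mathbf{r})}{\varepsilon(\mathbf{r})}\right] + k_0^2\,\varepsilon(\mathbf{r})\,\mathbf{E}(\mathbf{r}) = 0. \qquad (2)$$

The propagation medium electrical properties are specified via the complex relative dielectric constant $\varepsilon(\mathbf{r})$. For
nonionized media, the dielectric constant is $\varepsilon(\mathbf{r}) = \varepsilon'(\mathbf{r}) + i\sigma(\mathbf{r})/(\omega\varepsilon_0)$, where $\varepsilon_0$ is the vacuum dielectric
constant, $\sigma$ is the medium conductivity, and $\varepsilon'$ is the usual permittivity of the medium. Choose a spherical,
earth-centered coordinate system $\mathbf{r} = (r, \theta, \phi)$, with respective unit vectors $(\hat{e}_r, \hat{e}_\theta, \hat{e}_\phi)$, and let $\phi = 0$ be the
meridian plane containing the source and observation point.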
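The dielectric-constant expression above is easy to evaluate numerically. The following is a minimal sketch, not part of the VTRPE code; the function name `complex_permittivity` and the example values are assumptions chosen only for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def complex_permittivity(eps_real, sigma, freq_hz):
    """Return eps = eps' + i*sigma/(omega*eps0) for the exp(-i*omega*t) convention.

    eps_real : relative permittivity eps' (dimensionless)
    sigma    : conductivity in S/m
    freq_hz  : frequency in Hz
    """
    omega = 2.0 * math.pi * freq_hz
    return complex(eps_real, sigma / (omega * EPS0))


# Illustrative (assumed) values loosely typical of sea water at 1 GHz.
eps_sea = complex_permittivity(80.0, 4.0, 1.0e9)
```

The imaginary part scales as 1/frequency, so a surface that behaves as a good conductor at VHF can look like a lossy dielectric at microwave frequencies.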

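Now Eqs. (1-2) are vector equations, the components of which are coupled together due to the presence of the
$\nabla\varepsilon(\mathbf{r})$ term. However, these vector equations can be replaced by simpler scalar equations for the azimuthal fields
if two conditions are met. First, the transmitter emits radiation that is linearly polarized, i.e., the electric field
vector has nonzero components lying either wholly within (vertical polarization) or perpendicular to (horizontal
polarization) the meridian plane containing the source and observation point. Second, the transverse gradients
in $\varepsilon$ are small: $\hat{e}_\phi \cdot \nabla\varepsilon(\mathbf{r}) \approx 0$. Fortunately, this is the case for tropospheric propagation involving many types
of communication and radar systems. If these two approximations are made, then the vector wave equations for
the full E or H fields can be replaced by simpler scalar equations for the non-zero azimuthal $\phi$-component. The
other field components can then be determined by straightforward application of the Maxwell curl relations. For
vertical polarization, the azimuthal magnetic field component $\mathbf{H}(\mathbf{r}) = H_\phi(\mathbf{r})\,\hat{e}_\phi$ will then satisfy

$$\frac{\varepsilon}{r}\frac{\partial}{\partial r}\left[\frac{1}{\varepsilon}\frac{\partial (r H_\phi)}{\partial r}\right] + \frac{\varepsilon}{r^2 \sin\theta}\frac{\partial}{\partial\theta}\left[\frac{\sin\theta}{\varepsilon}\frac{\partial H_\phi}{\partial\theta}\right] + \left[k_0^2\,\varepsilon - \frac{1}{r^2\sin^2\theta} - \frac{\cot\theta}{r^2\,\varepsilon}\frac{\partial\varepsilon}{\partial\theta}\right] H_\phi = 0, \qquad (3)$$

while for horizontal polarization, the azimuthal electric field component $\mathbf{E}(\mathbf{r}) = E_\phi(\mathbf{r})\,\hat{e}_\phi$ satisfies

$$\frac{1}{r}\frac{\partial^2 (r E_\phi)}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 E_\phi}{\partial\theta^2} + \frac{\cot\theta}{r^2}\frac{\partial E_\phi}{\partial\theta} + \left[k_0^2\,\varepsilon - \frac{1}{r^2\sin^2\theta}\right] E_\phi = 0. \qquad (4)$$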


For numerical calculations, it is beneficial to transform Eqs. (3-4) to a cartesian coordinate system $\mathbf{x} = (x, z)$
by means of the earth flattening transformation: $x = a\theta$, $z = a\ln(1 + h/a)$, where a is the effective earth radius,
and $h = r - a$ is the local altitude. Applying this earth flattening transformation to Eqs. (3-4) yields a cartesian
Helmholtz equation suitable for numerical work

$$\left[\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial z^2} + k^2(x,z)\right] w(x,z) = 0. \qquad (5)$$
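As an illustration of the earth flattening transformation, the sketch below maps between spherical (θ, h) and flattened (x, z) coordinates. The function names and the 4/3-earth effective radius used in the example are assumptions for illustration, not values taken from the paper.

```python
import math


def flatten(theta, h, a):
    """Map polar angle theta (rad) and altitude h (m) to flattened (x, z) in metres."""
    x = a * theta
    z = a * math.log1p(h / a)  # z = a*ln(1 + h/a); log1p keeps precision for h << a
    return x, z


def unflatten(x, z, a):
    """Inverse map: flattened (x, z) back to (theta, h)."""
    theta = x / a
    h = a * math.expm1(z / a)  # h = a*(exp(z/a) - 1)
    return theta, h


# Assumed effective earth radius: 4/3 of a 6371 km mean radius.
A_EFF = 4.0 / 3.0 * 6371.0e3
x_flat, z_flat = flatten(0.01, 1000.0, A_EFF)
```

For altitudes small compared with the earth radius, z differs from h only by roughly h²/(2a), which is a few centimetres at 1 km.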

The new dependent variable w is defined as (subscripts {h, v} denote horizontal and vertical polarization
respectively)

$$w_h(\mathbf{x}) = \sqrt{r\sin\theta}\;E_\phi(\mathbf{r}) = \sqrt{a\sin(x/a)}\;e^{+z/(2a)}\,E_\phi(\mathbf{x}), \qquad (6)$$

and m is the modified index of refraction $m(x,z) \equiv n(\mathbf{r})(1 + h/a)$. An “effective” wave number k(x,z) is also
defined by


,.2  ,.2_2  3sec2(i/a) 

- - 


and 


-kl- 


cot(j:/o)  dn 
an  dx 


5i2  5^2 


dn  * 
adz 


To solve Eq. (5) requires an initial condition plus boundary conditions to be met. The initial condition is
taken to be a unit strength point dipole. The boundary conditions are: 1) a Sommerfeld radiation type boundary
condition at infinity, $\lim_{r\to\infty} r\,(\partial A/\partial r - i k_0 A) = 0$, where A denotes $H_\phi$ or $E_\phi$; and 2) continuity of the
tangential electric and magnetic fields at the earth's surface r = a. Continuity of tangential field components is
achieved by modeling the earth as a locally homogeneous dielectric with finite conductivity and specifying a surface
boundary condition:¹¹


dz 

dtv^,{x,z) 


dm{x,  z)  I 


1 


1 


2a  m(2;,0)  dz 


+ 


(7) 


with  Cj  the  complex  dielectric  constant  of  the  earth’s  surface. 

In practical system applications, the actual electric or magnetic fields are not directly used. Instead, to
systematically incorporate propagation effects and antenna characteristics in system performance calculations, the
generalized radar transmission equation is often employed.¹² The radar transmission equation relates the received
power $P_r$ (for a receiver with antenna gain $G_r$) to the transmitted power $P_t$ (with transmitter antenna gain $G_t$) by

$$P_r(\mathbf{r}) = P_t\,G_t\,G_r \left[\frac{F(\mathbf{r})}{2 k_0 R}\right]^2,$$

where R is the transmitter-to-receiver distance, and F is the pattern propagation factor. By convention, F is
normalized to the field of a unit-strength point dipole source: $F_h(\mathbf{r}) = |E_\phi(\mathbf{r}) / E_{md}(\mathbf{r})|$ and
$F_v(\mathbf{r}) = |H_\phi(\mathbf{r}) / H_{ed}(\mathbf{r})|$, where $E_{md}$ is the electric field from a vertical magnetic dipole, and $H_{ed}$ is the
magnetic field of a vertical electric dipole. Employing the dipole fields, the propagation factor for horizontal
polarization $F_h$ is computed as¹³


F^(r)  = 


dTT  |u>;^(r)|p2 


l  +  (£o/?)- 


A'Ky/x 


(l  +  (^  - 


Similarly,  the  propagation  factor  for  vertical  polarization  F^  is 


n(r)  = 


l  +  {koR)  2  ^  ~  l^^|m(x)u;^(x)|  (l  +  (z  -  2o)V^^)  . 


(rsin  Oy 


3/2 


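If we define the general PE (GPE) operator Q(x) by $Q(\mathbf{x}) = \sqrt{\partial^2/\partial z^2 + k^2(\mathbf{x})}$, then Eq. (5) can be expressed in
the equivalent factored form

$$\left(\frac{\partial}{\partial x} + iQ\right)\left(\frac{\partial}{\partial x} - iQ\right) w(\mathbf{x}) + i\left[\frac{\partial}{\partial x}, Q\right] w(\mathbf{x}) = 0.$$

(The notation [F, G] = FG − GF is called the commutator of the operators F and G.)

Now for range-independent propagation the commutator $[\partial/\partial x, Q] = 0$, and the equation satisfied by the
outwardly propagating waves is just

$$\frac{\partial w(\mathbf{x})}{\partial x} = i\,Q(\mathbf{x})\,w(\mathbf{x}). \qquad (8)$$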

Solving Eq. (8) is equivalent to solving the range-independent Helmholtz equation, Eq. (5), and Eq. (8) is the most
general PE that is exact for range-independent media. Following Tappert,³ a parabolic wave equation with Q defined
above will be denoted as the general PE (GPE) and the Q operator will be denoted as the GPE propagator.
GPE is the most complete PE that is evolutionary in range and neglects back scattering. For range-independent
environments, it is exact within the limits of the far-field approximation, and is the starting point for all numerical
PE algorithms. In the following sections, the computational techniques used to solve the PE will be derived.
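The equivalence between the Helmholtz equation and its factored form can be checked by direct expansion. The short derivation below is a sketch that assumes Eq. (5) has the form $(\partial_x^2 + \partial_z^2 + k^2)\,w = 0$, so that $Q^2 = \partial_z^2 + k^2$; the shorthand $\partial_x \equiv \partial/\partial x$ is introduced here only for brevity.

```latex
\begin{align*}
\left(\partial_x + iQ\right)\left(\partial_x - iQ\right) w
  &= \partial_x^2 w - i\,\partial_x (Q w) + i\,Q\,\partial_x w + Q^2 w \\
  &= \left(\partial_x^2 + Q^2\right) w - i\,[\partial_x, Q]\,w ,
\end{align*}
so that
\begin{equation*}
\left(\partial_x^2 + \partial_z^2 + k^2\right) w
  = \left(\partial_x + iQ\right)\left(\partial_x - iQ\right) w
    + i\,[\partial_x, Q]\,w = 0 .
\end{equation*}
```

When k does not depend on x the commutator vanishes, and any solution of $(\partial_x - iQ)\,w = 0$ then also satisfies the full Helmholtz equation.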




§3.0  Split-Step  PE 


The outgoing wave solution of Eq. (8) is now obtained using the split-step PE algorithm. First, for numerical
stability, a rapidly varying phasor component is removed from w via the envelope transformation
$w(\mathbf{x}) = e^{i k_{ref} x}\,\psi(\mathbf{x})$, with $k_{ref}$ the reference wave number. The complex PE envelope function $\psi$ then solves

$$\frac{\partial\psi(\mathbf{x})}{\partial x} = i\,Q(\mathbf{x})\,\psi(\mathbf{x}), \qquad (9)$$

where the pseudo-differential operator Q is now $Q(\mathbf{x}) = \sqrt{\partial^2/\partial z^2 + k^2(\mathbf{x})} - k_{ref}$. Next, the wide-angle PE
(WAPE) approximation to Q is made by decomposing it into the sum of simpler local operators, $Q(\mathbf{x}) \approx A(z) + B(\mathbf{x})$,
where $A(z) = \sqrt{k_{ref}^2 + \partial^2/\partial z^2} - k_{ref}$ and $B(\mathbf{x}) = k(x_0 + \Delta x/2, z) - k_{ref}$. Given an initial condition
$\psi(x_0, z)$, a formal solution to Eq. (9) can now be written in exponential operator form as

$$\psi(x_0 + \Delta x, z) \approx e^{i\Delta x\,[A(z) + B(\mathbf{x})]}\,\psi(x_0, z). \qquad (10)$$

Finally, the Trotter product formula is used to symmetrically factor the exponential operator in Eq. (10) into
the product of simpler operators: a “kinetic energy” propagator $K(\Delta x) = e^{+i\Delta x A(z)}$ and a “potential energy”
phase correction $U(\Delta x) = e^{+i\Delta x B(\mathbf{x})}$, yielding the split-step PE (SSPE) algorithm:

$$\psi(x + \Delta x, z) = K(\Delta x/2)\,U(\Delta x)\,K(\Delta x/2)\,\psi(x, z) + O(\Delta x^3). \qquad (11)$$

The error term in Eq. (11) arises from the noncommutation of A and B as well as from any x-dependence of
the refraction operator B. Once a starting field is specified, the above expression is suitable for generating a
numerical solution by repeated application of Eq. (11). This numerical solution is unconditionally stable due to
the unitary nature of the operators K and U.

Now the operation $w(x_0 + \Delta x, z) = e^{i\Delta x\sqrt{k_{ref}^2 + \partial^2/\partial z^2}}\,w(x_0, z)$ is equivalent to solving the free-space
wave equation $[\partial^2/\partial x^2 + \partial^2/\partial z^2 + k_{ref}^2]\,w(x,z) = 0$, with an initial condition. Thus the SSPE
algorithm amounts to: a half-step of free-space propagation with the kinetic energy operator, $K(\Delta x/2)$; a phase
correction operation, $U(\Delta x)$, to account for potential refractive effects not included in K; and finally another
half-step of free-space propagation, $K(\Delta x/2)$.
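The half-step / phase-correction / half-step sequence can be sketched in a few lines. The code below is an illustrative, dependency-free toy, not the VTRPE implementation: it uses a plain periodic Fourier basis (no surface boundary condition and no absorber), a naive O(N²) DFT in place of an FFT, and the function names `dft`, `kinetic_step` and `sspe_step` are assumptions made for this example.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for small demo grids."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for j in range(n):
        acc = 0j
        for m, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * j * m / n)
        out.append(acc / n if inverse else acc)
    return out


def kinetic_step(psi, dx, dz, k_ref):
    """Apply K(dx) = exp(i*dx*A) with A(p) = sqrt(k_ref^2 - p^2) - k_ref in p-space."""
    n = len(psi)
    spectrum = dft(psi)
    for j in range(n):
        idx = j if j < n // 2 else j - n          # signed frequency index
        p = 2.0 * math.pi * idx / (n * dz)        # vertical wave number
        a_p = cmath.sqrt(k_ref ** 2 - p ** 2) - k_ref
        spectrum[j] *= cmath.exp(1j * dx * a_p)
    return dft(spectrum, inverse=True)


def sspe_step(psi, dx, dz, k_ref, k_profile):
    """One symmetric split step: K(dx/2) U(dx) K(dx/2)."""
    psi = kinetic_step(psi, dx / 2.0, dz, k_ref)
    # U(dx): multiplicative phase correction with B(z) = k(z) - k_ref
    psi = [v * cmath.exp(1j * dx * (k - k_ref)) for v, k in zip(psi, k_profile)]
    return kinetic_step(psi, dx / 2.0, dz, k_ref)
```

With a real wave-number profile and a grid spacing coarser than half a wavelength (so that every p is below k_ref), both factors are pure phases and the discrete field energy is preserved from step to step, which is the unitarity property noted in the text.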

The “potential energy” phase correction operator U is simply a multiplicative operation on the PE wave function
and is easily implemented numerically. The kinetic energy propagator K, however, is more complex, due to the
pseudo-differential nature of A. To implement the SSPE algorithm requires working in a basis which diagonalizes
A. One such basis is the vertical wave number Fourier basis, in which A is just a c-number:
$A(p) = \mathcal{F}[A(z)] = \sqrt{k_{ref}^2 - p^2} - k_{ref}$, where $\mathcal{F}$ is the z-space Fourier transform. The transform variable p may be
associated with a vertical wave number via $p = k_0\sin\theta$, where $\theta$ is the local propagation angle with respect to
the horizontal. Thus, the kinetic energy operator $K\psi$ is evaluated as

$$K(\Delta x)\,\psi(x, z) = \mathcal{F}^{-1}\left\{ e^{i\Delta x A(p)}\,\mathcal{F}[\psi(x, z)] \right\}. \qquad (12)$$


To accommodate a perfect conductor surface boundary condition at z = 0, image techniques may be used
in the SSFPE algorithm and odd parity (horizontal polarization) or even parity (vertical polarization) solutions
obtained. The kinetic energy operator remains unchanged in this case, allowing the exponential Fourier transforms
to be replaced by Fourier sine or cosine transforms over the half-line. For mixed surface boundary conditions such
as Eq. (7), however, the kinetic energy operator K in Eq. (12) must be modified.

To determine the form of K appropriate for a surface impedance boundary condition of the form
$\partial w(x,0)/\partial z = \beta\,w(x,0)$, the Helmholtz-Kirchhoff formula is used to solve for the field $w(\mathbf{x})$ for $x > x_0$ in terms of
an aperture field $w(\mathbf{x}_0)$ and the normal derivative of Green's function G


/■°°  .  ,dG{x,z\xQ,ZQ) 

w{x,z)  =  w(xq,Zq) - - dzQ. 


(13) 
The Green's function that solves $(\nabla^2 + k^2)\,G = -\delta(\mathbf{x} - \mathbf{x}_0)$ and satisfies the surface impedance boundary
condition $\partial G/\partial z = \beta G$ at z = 0 is¹⁴


G(x,Xo) = 


2i£^  j-|r  -  zolv/t^  +  /32  -  0{z  +  ^o) 


9{m 


±  -  ro|yP-p2  +  ,>Zo  r  -tpz  ^  !^^^  +  ip2l  ± _ _  (14) 

7-00  I  ip  +  /?  /  \/A:2  —  p2 

where $\Theta$ is the Heaviside step function: $\Theta(x) = 1$, $x > 0$; $\Theta(x) = 0$, $x < 0$. Substituting Eq. (14) in the Helmholtz-
Kirchhoff formula Eq. (13) and replacing w by the PE wave function $\psi$ yields¹⁵

V.(io  +  z)  =  -  *)  -  dt  »(»/?) 

)[pcosp,  +  /?sinp,l<i«.  (15) 

Jo  P  +  P  Jo 

The  term  in  Eq.  (15)  containing  the  Heaviside  step  function  (which  is  present  only  for  vertical  polarization) 
represents  a  surface  wave  (i.e.,  the  discrete  spectrum)  traveling  in  the  x  direction  and  decaying  exponentially 
away  from  the  interface,  while  the  integral  term  represents  the  continuous  spectrum  of  incident  and  reflected 
waves  from  the  surface. 

If  the  modified  Fourier  transform  Tp  and  the  corresponding  inverse  transform  are  defined  by^® 

p  cos  pz  +  /Jsinpz 


yoo  2  T 

J^I3W']=  4>{x)\p  cos  pz  +  u  sin  pz]dz  ,  =  -  ^(p)- 

Jo  ^  JO 


-dp, 


p2+/?2 

then  the  kinetic  energy  propagator  K  for  a  mixed  (Robin)  type  surface  boundary  condition,  that  is  the  extension 
of  Eq.  (12),  can  be  written  compactly  as 

K(Ax)r/>(xo,  z)  =  V’(^o.  [Vi(xo,  2)]} 

^20e^Axi^/k^TW-~k)-0z 


-f 


i>{xoJ)c-^^dt8{m-  (16) 


Note  that  the  potential  propagator  U  remains  the  same. 

Since the SSFPE method is a marching algorithm, a starting field is required. Very close to the source,
the propagation medium is assumed to be approximated as plane-stratified and is modeled by two semi-infinite
dielectric half-spaces having complex dielectric constants $\varepsilon_1$ for z > 0 and $\varepsilon_2$ for z < 0. The initial PE field
$\psi_0(z)$ is assumed to be due to a finite aperture source distribution s(z) located at the origin x = 0 in the
half-space z > 0. The starter field w(x, z) due to this aperture is computed using a scalar Green's function
G(x, z, z') as

$$w(x, z) = \int s(z')\,G(x, z, z')\,dz'. \qquad (17)$$



The  Green’s  function  G  has  a  spectral  representation 


Ch..P.  ^  {e+^l  I"  -  *''l  + 

xTC  dooe”'  ^ 


-JK.  (2  -I-  2 


where  P^j  ^  is  the  plane-wave  surface  reflection  coefficient  (including  surface  roughness),  and  Kj  =  \/k'^  —  If 
the  source  aperture  distribution  has  the  form 

s(0  =  s(t-2o)e’<'-^»)P‘’ 

corresponding  to  an  antenna  located  at  height  Zq  and  vertically  steered  to  pg  =  Ih^n  the  initial  PE  field 

•00(2)  is 

i>Qi^)  =  ^  J  {e~'P^o/(p-Po)  + (16) 

where  /(p)  is  the  antenna  pattern  corresponding  to  s.  This  PE  starting  field  displays  the  correct  singular  behavior 
as  p  — *  t. 
§4.0 Numerical Implementation


To implement the SSFPE algorithm, the infinite Fourier transforms in Eq. (12) or Eq. (16) are replaced by
discrete sine or cosine transforms over the finite interval $0 < z < Z_{max}$. These are evaluated numerically using
fast Fourier sine and cosine transform algorithms. This requires the PE wave function $\psi$ have compact support
and be band limited in vertical wave number space. The band limiting in p-space is accomplished by applying
a low-pass filter as part of the kinetic energy propagator step to remove high momentum components of $\Psi(x, p)$
before transforming back to z-space. This wave number filtering is critical, particularly when propagating over
variable terrain, since propagating energy can be slope converted to steeper angles (i.e., higher p values) by the
sloping terrain.

Another potential source of numerical difficulty in using the SSFPE method arises in treatment of the Sommerfeld
outgoing wave radiation boundary condition. Since the SSFPE algorithm employs an FFT, the implementation
of a radiation-type boundary condition is complicated. Truncation of the infinite z-domain down to a finite interval
will introduce spurious discrete standing wave solutions (modes) in the vertical. In effect, the terminal impedance
at the end of the transform grid $z = Z_{max}$ is not properly matched to the radiation boundary condition.

To circumvent this problem and attenuate these spurious modes introduced by the finite Fourier transforms and
also to prevent FFT “wrap around”, a complex absorber potential or sponge, $V_{abs}(z)$, is added to the split-step
B operator: $B(x, z) \Rightarrow B(x, z) + iV_{abs}(z)$. The specific functional form chosen for the complex absorber is

$$V_{abs}(z) = V_0\,\mathrm{sech}^2[(z - Z_{max})/w_0]. \qquad (19)$$

For each PE step, this will lead to an effective z-space low-pass filter of the form $e^{-\Delta x V_{abs}(z)}$ that is applied to
the upper part of the PE grid. The sponge parameters $\{V_0, w_0\}$ are determined parametrically by minimizing the
transmissivity and reflectivity from an equivalent quantum mechanical problem of free-particle scattering off a
complex  potential  barrier  having  the  form  Eq.  (19)  and  are  function  of  frequency,^  The  “physical  propagation 
region” in the VTRPE model thus extends only to an altitude below $Z_{max}$, to allow room for the complex absorber.
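The effect of the sponge on a single range step can be illustrated numerically. In the sketch below the absorber of Eq. (19) is turned into the per-step amplitude factor exp(−Δx·V_abs(z)) that results from adding iV_abs to B; the function names and all parameter values are assumptions for illustration, not the tuned VTRPE sponge parameters.

```python
import math


def absorber_potential(z, z_max, v0, w0):
    """V_abs(z) = V0 * sech^2((z - z_max) / w0), as in Eq. (19)."""
    # math.cosh overflows for |argument| above roughly 710; fine for ordinary grids.
    return v0 / math.cosh((z - z_max) / w0) ** 2


def absorber_filter(z_grid, dx, z_max, v0, w0):
    """Per-step amplitude factor exp(-dx * V_abs(z)) at each grid height."""
    return [math.exp(-dx * absorber_potential(z, z_max, v0, w0)) for z in z_grid]


# Assumed demo values: 1 km grid top, 50 m sponge width, 100 m range step.
demo_grid = [0.0, 500.0, 1000.0]
demo_filter = absorber_filter(demo_grid, dx=100.0, z_max=1000.0, v0=0.01, w0=50.0)
```

The factor stays essentially equal to one in the lower part of the grid and falls smoothly toward exp(−Δx·V0) at the grid top, which is why the usable propagation region has to end below the top of the transform grid.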

Once the FFT grid is defined, the vertical mesh spacing is set, but the PE range step $\Delta x$ remains unspecified.
Each application of the SSFPE algorithm, Eq. (11), leads to a local truncation error in the solution $\psi$ that
is proportional to the cube of the PE range step $\Delta x$. Since many range steps are typically taken, these can
accumulate and produce unacceptable errors in the final solution. To prevent this from happening, the VTRPE
code performs a global error estimate based upon a detailed analysis of the local SSFPE truncation error. To

1  u 

bound  the  total  global  error  in  the  solution  i/>,  the  PE  range  step  Ax  is  chosen  so  that  Ax  <  where 

is  a  specified  local  relative  error  tolerance  in  lA,  and  the  PE  wavefunction  error  norms  are  defined  as 


dz'^  dz 


4  dz^ 


dz  +  V'* 


dtp  b'^V 
dz  dz'^ 


Here $V(z) = m^2(z) - 1$ is the “effective potential” related to the modified index of refraction m. As the VTRPE
code advances the field, the local error budget is monitored and the range step-size $\Delta x$ is dynamically adjusted
to keep the local error below a preset threshold.
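One generic way to adjust a step size when the local error scales as the cube of the step is shown below. This is a textbook-style controller given only as an assumption-laden sketch; it is not the VTRPE error-norm formula, and the function name, safety factor, and step limits are hypothetical.

```python
def next_range_step(dx, local_error, tolerance, safety=0.9, dx_min=1.0, dx_max=1000.0):
    """Propose the next range step, assuming local_error is proportional to dx**3.

    dx          : current range step (m)
    local_error : estimated local relative error of the last step
    tolerance   : target local relative error
    """
    if local_error <= 0.0:
        return dx_max  # no measurable error: take the largest allowed step
    # Error ~ dx^3, so the step that would just meet the tolerance scales as the cube root.
    scale = safety * (tolerance / local_error) ** (1.0 / 3.0)
    return max(dx_min, min(dx_max, dx * scale))
```

Because of the cube-root dependence, an error eight times larger than the tolerance only halves the step, so the step size changes gently from one range step to the next.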


§5.0  Examples 


An example of VTRPE model output is shown in Figure 1. This is a screen dump from the PC version of the
model and depicts a coverage diagram of path loss $PL = 20\log(2 k_0 R) - 20\log|F|$ in decibels. Figure 1 represents a
250 m elevated transmitter over a 240 m slope with three ridges down range. The transmitter frequency is 2.5 GHz,
with a 2-deg vertical beamwidth antenna pattern. The polarization is horizontal. Surface dielectric properties are
characteristic of dry ground, and the solid dark line is the terrain profile. The atmospheric refractivity profile is
a 300 m surface based duct.
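The path loss expression quoted above can be evaluated directly. The helper below is a small sketch with an assumed function name; it takes the base-10 logarithm, as is conventional for decibel quantities, and treats the propagation factor F as a given input rather than computing it.

```python
import math

C_LIGHT = 299792458.0  # speed of light in m/s


def path_loss_db(freq_hz, range_m, prop_factor):
    """PL = 20*log10(2*k0*R) - 20*log10(|F|), in decibels."""
    k0 = 2.0 * math.pi * freq_hz / C_LIGHT  # vacuum wave number
    return 20.0 * math.log10(2.0 * k0 * range_m) - 20.0 * math.log10(abs(prop_factor))


# With F = 1 the result reduces to free-space loss, since 2*k0*R = 4*pi*R/lambda.
free_space_loss = path_loss_db(2.5e9, 10.0e3, 1.0)
```

A propagation factor below one (for example in a diffraction shadow) increases the loss, while ducting can push |F| above one and reduce the loss below the free-space value.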
§6.0 Summary


This paper describes the basic physics and numerical techniques used in implementing a variable terrain
electromagnetic parabolic wave equation propagation model. This propagation model allows for both finite surface
conductivity and variable (i.e., highly irregular) surface terrain. The model is based on a novel implementation of
the split-step Fourier PE algorithm to efficiently compute the electromagnetic radiation fields for surface impedance
boundary conditions.

The propagation model is implemented as the VTRPE (variable terrain radio parabolic equation) computer code
and is used in the prediction of microwave propagation in complex real-world environments. The VTRPE code
has the following characteristics:

(1)  full-wave  propagation  physics  (i.e.,  field  amplitude  and  phase  are  computed); 

(2)  direct  solution  of  electromagnetic  fields; 

(3)  exact  treatment  of  refraction  and  diffraction  phenomena; 

(4)  exact  treatment  of  multipath  phenomena; 

(5) range-dependent atmospheric refractivity inputs, N(z, r);

(6)  infinite  or  finite  conductivity  surface  boundary  conditions; 

(7)  linear  transmitter  field  polarization  (vertical  or  horizontal); 

(8)  variable  surface  terrain  elevation  and  surface  dielectric  properties; 

(9)  frequency  dependent  atmospheric  attenuation; 

(10) frequency range: approximately 0.01 to 100 GHz;

(11)  generalized  transmitter  radiation  patterns; 

(12)  arbitrary  transmitter/receiver  geometry; 

(13)  automatic  selection  of  the  range  step-size  and  FFT  transform  size;  and 

(14)  automatic  monitoring  of  solution  global  error. 

The VTRPE model properly accounts for the dominant mechanisms governing tropospheric microwave
propagation, including the effects of anomalous propagation arising from spatial changes in atmospheric refractivity
and variable terrain features.

This work was supported under the NRaD Independent Research program.

FIG. 1. Multiple ridge line propagation with 300m surface based duct

1. Introduction

Shipboard  radar  and  communication  systems  in  a  coastal  marine  environment  are  very  sensitive  to 
vertical  and  horizontal  variations  in  the  tropospheric  refractivity  conditions.  These  systems  frequently 
experience  extended  or  reduced  detection  ranges,  inaccurate  altitude  estimates,  and  increased  surface 
clutter  due  to  the  presence  of  nonstandard  atmospheric  refractivity  conditions  (e.g.,  surface  or  elevated 
ducts)  and  their  concomitant  impact  on  radio  wave  propagation.  If  synoptic  range  dependent  refractivity 
fields  were  available  in  a  timely  manner  aboard  a  ship,  an  automated  decision  aid  could  be  designed  to 
adjust  the  radar  output  to  account  for  these  anomalous  propagation  effects.  In  fact,  a  decision  aid  has 
been  developed  by  The  Applied  Physics  Laboratory  for  use  on  AEGIS  ships,  but  its  use  is  limited  by  the 
inadequacy  of  synoptic  refractivity  measurements. 

In most Naval battle scenarios a cruiser or destroyer is usually in communication with other friendly
ships and planes or helicopters, whose positions and velocities are known very accurately. This paper
will describe a technique for supplementing available refractivity data in situ, by letting the on board
microwave sensors receive a set of known signals transmitted at multiple frequencies from one or more
of these friendly vehicles. The refractivity profile estimation, or remote sensing, is accomplished using
matched field processing methods whereby a modeled signal waveform (replica) is cross correlated with
measured sensor data to yield a tomographic reconstruction of the medium.

The proposed remote sensing algorithm uses a nonlinear Gauss-Markov estimation technique, implemented
numerically with a finite difference Levenberg-Marquardt procedure. At each step of the iteration,
a parabolic wave equation (PE) model is used to compute the forward solution or replica used in the
matched field processor. The advantage of this proposed technique is that it offers the possibility of rapid
sensing of atmospheric refractivity fields using readily available hardware assets.

This paper addresses the initial phase of the work, namely formulating the procedure and establishing
the feasibility of the method. The next phase of the work will involve using measured coastal data from
experiments at Wallops Island, and The Naval Surface Warfare Center at Dahlgren, Virginia.

2.  Atmospheric  Refractivity 

In the troposphere, radio waves travel along curved ray paths and the amount of bending or refraction is
proportional to the gradient of the refractive index $n$ transverse to the ray path. At high altitudes, the
index of refraction $n$ will approach unity as the density decreases and, therefore, will tend to decrease
with height in the atmosphere. Since $n$ is very close to unity, it proves more convenient to work with the
refractivity $N$ defined by $N = (n - 1) \times 10^{6}$. For radio wavelengths removed from gaseous absorption
lines, the refractivity is found to be empirically related to the pressure $p$, temperature $T$ and humidity
(water vapor pressure) $e$ [1]:

$$N = 77.6\,\frac{p}{T} - 5.6\,\frac{e}{T} + 3.75 \times 10^{5}\,\frac{e}{T^{2}}$$

where the pressure is measured in mbar and temperature in K.
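The empirical relation above can be evaluated directly. The following is a minimal sketch, assuming the three-term form $N = 77.6\,p/T - 5.6\,e/T + 3.75 \times 10^{5}\,e/T^{2}$ with pressures in mbar and temperature in K; the function names and the sample surface conditions are illustrative assumptions and do not come from the paper.

```python
def refractivity(p_mbar, t_kelvin, e_mbar):
    """Radio refractivity N (N units) from pressure, temperature and vapor pressure."""
    dry_term = 77.6 * p_mbar / t_kelvin
    wet_term = -5.6 * e_mbar / t_kelvin + 3.75e5 * e_mbar / t_kelvin ** 2
    return dry_term + wet_term


def refractive_index(n_units):
    """Refractive index n recovered from the definition N = (n - 1) * 1e6."""
    return 1.0 + 1.0e-6 * n_units


# Hypothetical surface conditions, chosen only to exercise the formula.
N_surface = refractivity(p_mbar=1013.0, t_kelvin=288.0, e_mbar=10.0)
n_surface = refractive_index(N_surface)
```

The humidity term dominates the variability of $N$ near the sea surface, which is why the vapor pressure enters with the large $T^{-2}$ coefficient.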

The variation of $N$ with altitude will cause radio waves to propagate along curved paths, and it proves
useful to analyze this ray curvature via a coordinate transformation which maps the curved earth's surface
into a flat one. This is accomplished by replacing the radio refractive index $n$ by the modified refractive
index $m = n(1 + z/a)$, where $a$ is the effective earth radius. In similar fashion, the modified refractivity
$M$ is defined by

$$M = (m - 1) \times 10^{6} \approx N + 0.157\,z$$

where $z$ is height in meters. Variations of $N$ with height caused by humidity and temperature gradients
cause energy to be trapped when the vertical gradient of $M$ is non-positive.

In the atmospheric boundary layer (the lower kilometer of the atmosphere), the vertical profiles of
temperature and humidity are often approximately logarithmic [2]. Thus, over water where evaporation
occurs and the humidity profile decreases with height, $M$ will have a local minimum at some height $z = d$
above the water surface. If this height is sufficiently high in terms of wavelengths, then radio energy will
be trapped and propagation to long ranges is possible. Over water, this phenomenon is known as an
evaporation duct. Similar situations can occur over land due to increases of temperature with height in
nocturnal radiation inversions.

A convenient analytic representation of evaporation duct type refractivity profiles is provided by the
log-linear form:

$$M(z) = M_0 + g\left[z - (d + z_0)\ln(1 + z/z_0)\right] \tag{1}$$

where $M_0$ is the surface refractivity value, $g$ is the asymptotic gradient in $M$ at large heights, $d$ is the
duct height (i.e., the height of local profile minimum), and $z_0$ is known as the roughness parameter.
For the ocean surface, a typical value for $z_0$ is $z_0 \sim 10^{-4}$ meters. The advantage of an analytic profile
representation, such as the log-linear form, over a discrete form is the smaller number of parameters
needed to specify the profile. This is important when numerical inversion methods are attempted, as will
be discussed later.
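As a rough illustration of why the log-linear form is convenient, the sketch below evaluates a profile of the form $M(z) = M_0 + g[z - (d + z_0)\ln(1 + z/z_0)]$ and checks that its minimum falls at the duct height $d$. The function names and all numerical values (surface value, gradient, duct height, roughness) are assumptions chosen for illustration, not values from the paper.

```python
import math


def log_linear_M(z, M0, g, d, z0):
    """Modified refractivity of a log-linear evaporation duct profile."""
    return M0 + g * (z - (d + z0) * math.log(1.0 + z / z0))


def log_linear_dMdz(z, g, d, z0):
    """Vertical gradient of the profile; it vanishes at z = d."""
    return g * (1.0 - (d + z0) / (z + z0))


# Illustrative (hypothetical) parameters: four numbers fix the whole profile.
M0, g, d, z0 = 340.0, 0.13, 10.0, 1.0e-4

heights = [0.1 * i for i in range(1, 501)]          # 0.1 m to 50 m
profile = [log_linear_M(z, M0, g, d, z0) for z in heights]
z_min = heights[profile.index(min(profile))]        # grid estimate of duct height
```

Below $z = d$ the gradient is negative (trapping), and above it the gradient tends to $g$, so an inversion only has to search over a handful of parameters per profile rather than one value per height sample.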


3.  Inverse  Medium  Problem  for  Electromagnetic  Waves 

In the last few years Colton and Kress [4] have extended the theory of inverse scattering of acoustic and
electromagnetic waves to include the inverse medium problem for electromagnetic waves. They define the
inverse medium problem for electromagnetic waves to be the determination of the index of refraction $n(x)$
from a knowledge of the far field radiation pattern $E_\infty$ of a time harmonic solution $E(x, t)$ to Maxwell's
equations in an isotropic inhomogeneous medium. These ideas are developed rigorously using classical
and functional analysis in the above cited book by Colton and Kress. These techniques are applied to
the detection and monitoring of leukemia in two papers soon to be published by Colton and Monk [5].
The refractivity estimation problem which the present paper addresses is an inverse medium problem,
but it is much more complex than those defined by Colton and Kress. In order to obtain classical
solutions to the forward scattering problems, Colton and Kress require that the incident wave be
a homogeneous plane wave, that there be no boundary conditions, and that the refractive index
be spherically stratified. Even under these very restrictive assumptions classical solutions to inverse
scattering problems for acoustic and electromagnetic waves are not available, and it is necessary to resort
to numerical techniques. Some of these numerical techniques applicable to inversion problems for acoustic
waves are described in detail in the book by Colton and Kress [4]. A powerful numerical technique for the
inverse medium problem is developed by Colton and Monk [6], and applied to a medical imaging problem,
but the restrictive assumptions mentioned above are still necessary. The solution to the inverse medium
problem proposed in this paper uses a numerical solution to the forward scattering problem as well as
the inverse problem, which simplifies the mathematical formulation of the entire procedure, as will be seen
in the next section.

Inversion problems involving acoustic and electromagnetic waves arise in a variety of important applications
ranging from medical imaging to geophysics to the nondestructive testing of materials. Since
these problems involve the inversion of a partial differential operator they generally fall into the class of
problems which are called ill posed. A problem in mathematical physics is called well posed provided:

1. a solution should exist

2.  the  solution  should  be  unique 

3.  the  solution  should  depend  continuously  on  the  data 

otherwise the problem is called ill posed. The third requirement is motivated by the fact that in all
applications the data will be measured quantities, and it is desirable that small errors in the data cause
only small errors in the solutions. It has been established that almost all of the physically motivated classical
initial-boundary value problems in partial differential equations, with the exception of the backwards
heat conduction equation, are well posed; see any book on partial differential equations such as [7].
Such problems in scattering theory are usually referred to as forward or direct problems. However, most
inverse scattering problems are ill posed. Though "ill-posedness" may be an undesirable property to a
classical mathematician, in order to determine refractivity parameters from the far field measurements of
electromagnetic path loss over the surface of the earth, it is necessary that the problem be substantially
overdetermined, so no solution can actually exist.

To place the refractivity estimation problem in the context of a general inversion problem, consider
a nonlinear function $F$ which maps a Hilbert space $X$ into a Hilbert space $Y$. An inversion problem for
the operator $F$ can be defined as follows: given an arbitrary $y$ in $Y$, find an $x$ in $X$ such that

$$F(x) = y \tag{2}$$

In order to solve the nonlinear equation (2), it is customary to linearize about an approximate solution
$x_0$ to the operator equation

$$F(x_0 + s) = F(x_0) + F'(x_0)s + o(\|s\|) \tag{3}$$

and apply a Newton-type iteration scheme. Given $x_0 \in X$, then for $k = 0, 1, \ldots$, set

$$x_{k+1} = x_k + s_k \tag{4}$$

where $s_k \in X$ is a solution to

$$F'(x_k)s_k = y - F(x_k) \tag{5}$$

If $x_0$ is a good approximate solution to Eq. (2), and Eq. (5) has a solution $s_k$ for each $k$, then $\{x_1, x_2, \ldots\}$
will be a sequence of increasingly better approximate solutions to Eq. (2). If $F$ is a partial differential
operator on a Hilbert space of vector valued square integrable functions, then the differential operator is
converted into an integral operator, and hopefully the solution to the direct problem may be expressible
in terms of a Fredholm-type integral equation. In some cases classical solutions to the direct scattering
problem are available and yield straightforward numerical solutions to the inverse problem Eq. (2), without
resorting to linearization and Newton-type methods. These cases are thoroughly explored in Colton and
Kress' book [4]. They also have developed a new technique, called a dual space method, which does not
require classical closed form solutions to the direct problem, but it does require that the direct problem
be expressible in terms of an integral equation. Unfortunately, propagation problems around the surface of
the earth with range and height dependent refractivity do not lend themselves to Fredholm-type integral
equation formulations. However, in recent years the parabolic approximation to the Helmholtz equation
has been shown to be a very accurate and efficient procedure for solving the direct medium problem,
even under very complex initial, boundary and refractivity conditions. The mathematical formulation
presented in the next section of this paper uses the PE approximation to solve the direct problem and
a very robust numerical technique, based on the theory of nonlinear least squares, to solve the inverse
medium problem and obtain estimates of range varying refractivity parameters based on measured far
field propagation path losses.

The general inversion problem was stated in Eq. (2), with an iterative Newton-type solution
technique given by Eq. (4) and Eq. (5). Now in most applied problems $y$ is not in the range of $F$,
because of measurement and modeling errors, so no solution to Eq. (2) exists. Thus it is desirable to design
an approximate solution method which solves iteratively a sequence of well posed subproblems, rather
than the ill posed inversion problem. The subproblems should satisfy each of the following characteristics:

1.  Each  subproblem  is  well  posed. 

2. Each subproblem lends itself to an efficient computational solution technique.

3. Since the measured quantity y is not likely to be in the range of any of the subproblem operators,
there must be simple criteria for determining when the solution to a subproblem is sufficiently close
to y.

In the functional analysis literature, the construction of subproblems for an ill posed problem is
referred to as regularization; see Colton and Kress [4] for a very thorough treatment of all of these topics.
An algorithm commonly used to solve the general inversion problem Eq. (2), called the Levenberg-Marquardt
algorithm, is based on a "linearize and regularize" approach, and was originally derived to
solve the nonlinear least-squares problem [8]. The resulting subproblems after linearization are: given an
approximate initial solution $x_0 \in X$ to Eq. (2), then for $k = 0, 1, \ldots$, set

$$x_{k+1} = x_k + s_k$$

where the parameter $\mu_k \ge 0$ and the step $s_k$ together solve the subproblem

$$\left[F'(x_k)^* F'(x_k) + \mu_k I\right] s_k = F'(x_k)^* \left(y - F(x_k)\right) \tag{6}$$

In a Hilbert space setting $F'(x_k)^*$ is the adjoint of the Fréchet derivative of the operator $F$, and the
Levenberg-Marquardt procedure is an example of a Tikhonov regularization technique.
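In a finite dimensional setting the regularized subproblem reduces to a damped normal-equation solve with a Jacobian matrix. The following is a minimal sketch under several assumptions: `toy_forward` is a hypothetical two-parameter stand-in for a real forward model (it is not a PE code), the Jacobian is formed by forward differences, and the rule for raising or lowering the damping parameter is a common heuristic rather than anything specified in the text.

```python
import numpy as np


def finite_difference_jacobian(forward, x, rel_step=1e-6):
    """Forward-difference Jacobian: one extra forward evaluation per parameter."""
    f0 = np.asarray(forward(x), dtype=float)
    jac = np.empty((f0.size, x.size))
    for j in range(x.size):
        h = rel_step * max(1.0, abs(x[j]))
        shifted = x.copy()
        shifted[j] += h
        jac[:, j] = (np.asarray(forward(shifted), dtype=float) - f0) / h
    return jac


def levenberg_marquardt(forward, y, x0, mu=1e-2, max_iter=200, tol=1e-12):
    """Iterate x_{k+1} = x_k + s_k with (J^T J + mu I) s_k = J^T (y - F(x_k))."""
    x = np.asarray(x0, dtype=float)
    y = np.asarray(y, dtype=float)
    cost = 0.5 * np.sum((forward(x) - y) ** 2)
    for _ in range(max_iter):
        jac = finite_difference_jacobian(forward, x)
        residual = y - forward(x)
        lhs = jac.T @ jac + mu * np.eye(x.size)
        step = np.linalg.solve(lhs, jac.T @ residual)
        if np.linalg.norm(step) < tol:
            break
        trial = x + step
        trial_cost = 0.5 * np.sum((forward(trial) - y) ** 2)
        if trial_cost < cost:      # accept the step and relax the damping
            x, cost = trial, trial_cost
            mu *= 0.1
        else:                      # reject the step and regularize more strongly
            mu *= 10.0
    return x


def toy_forward(x):
    """Hypothetical forward model: amplitude x[0], decay rate x[1] over range."""
    ranges = np.linspace(1.0, 5.0, 20)
    return x[0] * np.exp(-x[1] * ranges)


true_params = np.array([2.0, 0.5])
measured = toy_forward(true_params)            # noise-free synthetic "data"
estimate = levenberg_marquardt(toy_forward, measured, x0=[1.5, 0.8])
```

With noisy data the residual would not vanish at the minimizer, which is the situation the regularization term is meant to handle.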

The  refractivity  estimation  problem  presented  in  this  paper  is  an  ill  posed  inversion  problem  in  the 
general  theory  of  the  scattering  of  electromagnetic  waves.  Since  the  problem  allows  for  very  general 
initial  and  boundary  conditions  as  well  as  height  and  range  dependent  refractivity,  it  is  not  possible 
to  obtain  classical  solutions  in  terms  of  integral  equations,  even  for  the  direct  problem.  The  solution 
method  proposed  for  the  problem  involves  solving  the  direct  problem  with  a  PE  model,  which  results 
from  approximating  the  azimuth  independent  Helmholtz  equation,  ignoring  the  backscatter.  The  domain, 
which  is  the  refractivity  parameter  space,  will  be  discretized  into  a  finite  number  of  refractivity  profiles, 
each  characterized  by  a  finite  number  of  parameters.  The  output  space,  which  is  the  far  field  path  loss, 
will  be  discretized  by  transmitter  height,  receiver  height,  range  and  frequency.  In  this  formulation  the 
function  F  is  a  nonlinear  vector-valued  function,  whose  domain  and  range  are  both  finite  dimensional 
Euclidean  spaces,  and  the  derivative  operators  appearing  in  Eq.  (6)  are  the  Jacobian  matrices  of  partial 
derivatives. Each row of $F'(x_k)$ consists of the partial derivatives of a given path loss with respect to
each of the refractivity parameters. Correspondingly, each column of $F'(x_k)$ consists of the partial derivatives
of each path loss with respect to a given refractivity parameter. These partial derivatives are computed
using  finite  difference  approximations,  and  the  finite  differences  are  computed  with  the  PE  model.  The 
Levenberg-Marquardt  technique  Eq.  (6)  is  used  to  generate  the  subproblems  after  linearization.  Each 
subproblem corresponds to minimizing the sum of the Euclidean norms of the residuals between each
modeled path loss and each correspondingly discretized measured path loss. A new subproblem is generated
by linearizing Eq. (3) about the solution to the previous subproblem. This technique results in a very
robust numerical scheme, because the singular value decomposition of the Jacobian matrix can be used to
eliminate redundant refractivity parameters. The solutions to the subproblems are then projected
into the subspace spanned by the significant parameters for the given range, transmitter height, receiver
height and frequency diversity.
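The role of the singular value decomposition can be sketched as follows, assuming a small hypothetical Jacobian in which one parameter is an exact combination of two others. The function name, the relative threshold `rel_tol`, and the matrix entries are illustrative assumptions; the text does not specify how the cutoff is chosen.

```python
import numpy as np


def truncated_svd_step(jac, residual, rel_tol=1e-6):
    """Least-squares step restricted to well-determined parameter directions.

    Returns the step and the number of singular values that were kept.
    """
    u, s, vt = np.linalg.svd(jac, full_matrices=False)
    keep = s > rel_tol * s[0]                 # discard near-null directions
    coeffs = (u[:, keep].T @ residual) / s[keep]
    return vt[keep].T @ coeffs, int(np.count_nonzero(keep))


# Hypothetical 4 x 3 Jacobian: the third column is the sum of the first two,
# so only two combinations of the three parameters are observable.
jac = np.array([[1.0, 0.0, 1.0],
                [0.0, 1.0, 1.0],
                [1.0, 1.0, 2.0],
                [2.0, 1.0, 3.0]])
residual = jac @ np.array([1.0, 2.0, 0.0])

step, rank = truncated_svd_step(jac, residual)
```

Because the discarded direction contributes nothing to the modeled data, the step is the minimum-norm update that still reproduces the residual, and the reduced normal equations remain positive definite without damping.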

The  measured  data  set  to  which  the  technique  is  being  applied  has  transmitter  and  receiver  height 
diversity  as  well  as  frequency  diversity.  Another  set  of  data  which  will  soon  be  available  also  exhibits  range 
diversity, as well as transmitter and receiver height variations, but it has limited frequency diversity. The
range  diversity  in  the  propagation  measurements  will  add  some  interest  since  the  procedure  will  estimate 
range  dependent  refractivity  parameters.  More  details  on  the  mathematical  formulation  and  the  data 
sets  will  be  given  in  the  next  section. 

4.  Refractivity  Estimation  from  Path  Loss  Measurements 

It is well known in the radar community that if synoptic refractivity data were readily available, tactical
decision aids could be developed to compensate for the effects of anomalous propagation on the sensors
and weapon systems. The question that this paper addresses is how to obtain tropospheric refractivity
data, given that measurements of the one-way path loss between a transmitter and a receiver are available.
The question as to whether propagation measurements are easy to obtain over a wide frequency band is
being addressed at The Naval Surface Warfare Center by Stapleton and Kang [9]. Also, considerable effort
has been devoted to the modeling of EM wave propagation over a broad frequency band with the PE
model; see the two previous papers in these proceedings [10], [11]. The numerical procedure to be outlined
in this section uses a PE model to solve the direct EM medium scattering problem, and the path loss
measurements with a Levenberg-Marquardt procedure to perform the inversion.

Assume that there are $l_1$ transmitter locations and $l_2$ receiver locations. Though we can assume
range diversity, to simplify the notation, assume a fixed range between antennae. Also assume that there
are $k$ distinct frequencies. The tropospheric refractivity environment is parameterized by a set of $l$ profiles
$\{P_1, P_2, \ldots, P_l\}$, with the $i$-th profile $P_i$ defined by $n_i$ parameters. Thus a parameter vector would be
$x = (P_1, P_2, \ldots, P_l)$, and when we say $x \in R^n$ we mean $n = n_1 + n_2 + \cdots + n_l$. We
can either use an analytical representation for each profile, for example the log-linear form Eq. (1), or a
piecewise linear or spline characterization. The advantage of the former approach is the greatly reduced
size of the parameter space that must be searched. Now since a PE model outputs a path loss at a discrete
set of receiver heights, for a given transmitter height and frequency, the PE model must be executed
$k \cdot l_1$ times to obtain the $m = k \cdot l_1 \cdot l_2$ modeled path losses. Thus the direct medium scattering equation

$$y = F(x) \tag{7}$$

denotes a nonlinear vector valued function, with the PE model mapping the refractivity parameter space
in $R^n$ into the modeled path loss space in $R^m$. Now we are not actually inverting Eq. (7), since the
measured path loss vector $y$ is not in the range of the PE model, so we must find an $x$ in the refractivity
parameter space which minimizes the Euclidean norm of the residual between modeled and measured path
loss:

$$K(x) = \tfrac{1}{2}\,(F(x) - y)^T (F(x) - y) \tag{8}$$

The Levenberg-Marquardt technique Eq. (6) can now be written as: find a $\mu_k \ge 0$ and a step $s_k$ such that

$$(J_k^T J_k + \mu_k I)\,s_k = J_k^T (y - F_k) \tag{9}$$

where $F_k = F(x_k)$ and $J_k = J(x_k)$ is the Jacobian matrix of partial derivatives, approximated by finite differences, using
PE for the function evaluations. A singular value decomposition will be performed on the Jacobian matrix
to remove the redundant parameters. Then $\mu_k$ can be set equal to 0 in Eq. (9) and the much simpler
Gauss-Newton method results, involving only positive definite Hessian approximations $J_k^T J_k$. In
this case a QR decomposition of the Hessians results in a very efficient, robust implementation. In the
computer implementation the modeled and the measured path losses are each divided by the path loss
modeled for standard propagation; this should also reduce the residuals in Eq. (8) and force $\mu_k$ equal to
zero.

The data set described in [9] has sufficient frequency and antenna height diversity to accurately
estimate 50 to 100 refractivity parameters. Considering that the maximum antenna height is under
30 meters, and the antenna separation is 15 kilometers, it may be possible to obtain a good PE match
to measured path loss with as few as 20 parameters.

One  final  note  is  that  if  a  measurement  covariance  matrix  R  is  available,  then  R  can  be  inserted 
into  Eq.  (8),  and  the  resulting  solution  to  Eq.  (9)  is  the  nonlinear  Gauss-Markov  or  minimum  variance 
statistical  estimate  of  the  refractivity  parameters. 
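One way to read the remark about the covariance matrix is that the residual is weighted by $R^{-1}$, so that the step solves $(J^T R^{-1} J + \mu I)\,s = J^T R^{-1}(y - F)$. The sketch below assumes exactly that weighted form, which is the standard weighted least-squares construction and is not spelled out in the text; the function name and the two-measurement example are hypothetical.

```python
import numpy as np


def weighted_step(jac, residual, cov, mu=0.0):
    """Solve (J^T R^-1 J + mu I) s = J^T R^-1 r for a symmetric covariance R."""
    weighted_jac = np.linalg.solve(cov, jac)           # R^-1 J without forming R^-1
    lhs = jac.T @ weighted_jac + mu * np.eye(jac.shape[1])
    rhs = weighted_jac.T @ residual                    # equals J^T R^-1 r
    return np.linalg.solve(lhs, rhs)


# Hypothetical example: one parameter observed twice with unequal variances.
jac = np.array([[1.0], [1.0]])
residual = np.array([1.0, 3.0])
cov = np.diag([1.0, 4.0])

step = weighted_step(jac, residual, cov)
```

In this example the less noisy measurement dominates, and the step is the inverse-variance weighted mean of the two residuals.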
1. Introduction.

Many of the essential features of UHF/VHF radio wave propagation in the atmosphere are closely
related to the structure of the radio refractive index. If the vertical gradient of the radio refractive index
is negative and less than -157 N units/km, the rays bend downward to the earth so that ducting is
possible [10]. Statistically, the incidence of ducting conditions over wide areas is 40%-80%
[2], [5]. Owing to the presence of ducts, beyond line-of-sight radio wave propagation and
radio communication are possible at distances of about 2000-3000 km. The path losses of radio waves in
ducts are usually connected with damping of waves in the atmosphere, reflection conditions and
geometric factors.
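The ducting criterion can be checked layer by layer on a sampled profile. The following is a minimal sketch, assuming a profile given as heights in km and refractivity in N units; the function name and the sample values are hypothetical and serve only to illustrate the -157 N units/km threshold.

```python
def trapping_layers(heights_km, n_units, threshold=-157.0):
    """Return (bottom, top, gradient) for layers with dN/dz below the threshold."""
    layers = []
    for i in range(len(heights_km) - 1):
        thickness = heights_km[i + 1] - heights_km[i]
        gradient = (n_units[i + 1] - n_units[i]) / thickness   # N units per km
        if gradient < threshold:
            layers.append((heights_km[i], heights_km[i + 1], gradient))
    return layers


# Hypothetical profile: a sharp refractivity drop between 100 m and 200 m.
heights_km = [0.0, 0.1, 0.2, 0.3]
n_units = [320.0, 316.0, 290.0, 286.0]

layers = trapping_layers(heights_km, n_units)
```

Only the middle layer, with a gradient near -260 N units/km, satisfies the criterion; the layers above and below have gradients near -40 N units/km and do not trap.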

The appearance of ducts is very well predictable. Very frequently the parameters of ducts can
be modified by different acoustic and gravity waves creating a periodic structure of the radio refractive
index along the path. Atmospheric gravity waves producing wavelike disturbances of duct layers were
detected by pressure sensors on the ground [2]. The behavior of rays in ducts is similar to that
of a nonlinear dynamical system, and periodic spatial disturbances of ducts can cause catastrophic
changes in the behavior of rays. These effects can be very strong and more essential for assessments
of communication link range than other path losses. Another factor affecting radio wave
propagation is the shape of the terrain [1], [7]. Very often the shape of the boundary in atmospheric ducts
contains quasi-periodic components: waves on the sea surface, buildings, hills. In some cases the
influence of the boundary conditions on rays in ducts is similar to the effects caused by processes
that periodically modulate the refractive index profile.


2.  Refractive  index  profile. 

Radio refractive index profiles in atmospheric ducts measured by radiosondes or lidars [8], [9] can
have a very complicated structure. Very often ducts have a thin multiple-layer structure [2]. Simple
analytical or semi-empirical models of the duct profile can be used for ray tracing in ducts. Because
of the rapid divergence of rays in modulated ducts, synthesized models that more closely approximate
the altitude behavior of the radio refractive index are preferable for assessments of ray propagation.
An example of output for such a model is shown in Figure 1, containing the radio refractivity profile in
N units and its altitude gradient. To fit the input points, a special extrapolation procedure
with trigonometric smoothing was used, which gives better results than the commonly used spline methods.
The profile in Figure 1 is a kind of elevated duct and is typical for many situations [7], [8].

Fig. 1. Refractivity Profile for Elevated Duct and Altitude Gradient of Refractive Index.


3.  Ray  approximation. 

The geometric optics approximation can be used for ray tracing in ducts. Instead of the usually used
length-altitude variables, it is more convenient for ray tracing in modulated ducts to use
altitude-impulse variables $(x_1, x_2)$ [6], [11], [12]:

$$x_1 = r - R_e \tag{3.1}$$


n  r 


(3.2) 


where $x_1$ is the altitude above the earth, $r$ is the radius of the point on the ray path, $R_e$ is the radius of the Earth,
$n$ is the radio refractive index, $x_2$ is the impulse, $x_2 = p_r = n\sin\alpha$, and $\alpha$ is the angle of ray inclination to the Earth.
The impulse $x_2$ is proportional to the vertical component of the wave vector. When $x_2$ is negative the ray
bends downward to the earth, when $x_2$ equals zero the ray is parallel to the earth, and when
$x_2$ is positive the ray propagates outward from the earth. The refractive index $n$ is connected with the
refractivity $N$ by the relation [3], [10]

$$n = 1 + 10^{-6} N \tag{3.3}$$

In the variables $(x_1, x_2)$ the ray paths satisfy the equations [6], [11], [12]

a-2  (1  + 


•^2=  +  (  1  + 
where $\dot{x}_{1,2} = dx_{1,2}/dt$, $t = R_e\varphi$, and $\varphi$ is the polar angle in the plane of the ray trace.

Equations (3.4), (3.5) form a nonlinear Hamiltonian system. It was solved numerically and the results
are presented as phase portraits.


4.  Ducting  in  quiet  atmosphere. 

In the absence of disturbances, the results of calculations for the refractivity profile shown in Figure
1 are represented as the corresponding phase portrait in Figure 2. The total phase space consists of
non-intersecting separate curves. Each curve corresponds to a separate ray path. Continuous
curves correspond to rays trapped in elevated ducts. It is seen that the duct has a complex fine
structure. It consists of two elevated duct subsystems: a lower subsystem (curves 3 and 2) and an upper
subsystem (curves 5, 6, 7). Curve 3 corresponds to the ray that propagates in the atmosphere
without touching the surface of the earth. Curve 2 corresponds to the ray trapped in the duct that
periodically reflects from the surface of the earth. The altitude gradient of the refractive index near the
earth surface is small (Figure 1), so the rays starting near the earth at zero elevation angle bend
outward. The upper duct subsystem consists of an embracing duct (curve 5) and two nested separate
subducts (curves 6 and 7) inside. Curves 1, 4, 8 correspond to untrapped rays that reflect from
the earth or propagate between the lower and upper duct subsystems and over the upper duct subsystem.
Topologically, the areas between curves 2 (6, 5) and 4 (7, 8) are separated by a separatrix.

[Figure caption fragments, only partly recoverable: "... conditions." and "... along path refractive index structure."]

For rays trapped in a duct, the area under the corresponding curve in the phase portrait is an adiabatic
invariant.


Knowledge of the ray adiabatic invariant (4.1) at the site of the transmitter makes it possible to recalculate and
obtain ray characteristics at the site of the receiver and vice versa [4]. The method of the ray adiabatic
invariant can be used for slowly, non-periodically changing ducts. Use of the method of the ray
adiabatic invariant for periodically modulated ducts is very restricted. The ray adiabatic invariant is
constant only for disturbances that uniformly modify the structure of the refractive index profile.
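The invariant can be estimated numerically from a sampled phase curve. The sketch below assumes that the invariant is proportional to the area enclosed by a closed curve in the $(x_1, x_2)$ plane, i.e. $\oint x_2\,dx_1$, and evaluates it with the shoelace formula; this interpretation, the function name, and the elliptical test curve are assumptions, since the defining expression is not reproduced in the text.

```python
import math


def enclosed_area(x1, x2):
    """Area enclosed by a closed polygonal phase curve (shoelace formula)."""
    total = 0.0
    count = len(x1)
    for i in range(count):
        j = (i + 1) % count                  # wrap around to close the curve
        total += x1[i] * x2[j] - x1[j] * x2[i]
    return 0.5 * abs(total)


# Hypothetical trapped-ray curve: an ellipse with semi-axes of 100 m in
# altitude and 0.002 in impulse, sampled at 2000 points.
samples = 2000
x1 = [300.0 + 100.0 * math.cos(2.0 * math.pi * i / samples) for i in range(samples)]
x2 = [0.002 * math.sin(2.0 * math.pi * i / samples) for i in range(samples)]

invariant = enclosed_area(x1, x2)
```

Comparing this quantity at the transmitter and receiver sites gives a quick check of whether a slowly varying duct preserves it along the path.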

5.  Ducting  in  modulated  atmosphere. 

Periodic modulation of duct parameters destroys the ray adiabatic invariant. Separate curves on the
phase portrait transform into phase layers. The width of the phase layers depends on the parameters of
the duct modulation. Very often ducting becomes altogether impossible. Under 4% modulation of duct
altitude with a spatial wavelength of 160 km, the phase portrait represented in Figure 2 is
transformed into the phase portrait in Figure 3. The ducting area is significantly reduced although
the altitude gradients of the refractive index are almost the same. For the former upper duct subsystem,
ducting becomes impossible. Rays can be trapped in this structure only for a very restricted
time. They can penetrate from one subduct to another but then they leave the duct subsystem. The
behavior of rays near the separatrix in the lower duct subsystem is similar. Ducting is possible only in
the internal region of the lower elevated duct (curve 3 in Figure 2 and Figure 3). For other modulation
wavelengths or larger amplitudes, the leakage of radio waves from ducts can be far greater.


6.  Ducting  over  periodical  boundary. 

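To investigate ray tracing over a periodic boundary, the shape of the lower boundary of the duct was
chosen in the form

$$r = R_e + a\left(1 + \sin\left(\frac{2\pi}{\lambda}\,t + \psi\right)\right) \tag{6.1}$$

where $\lambda$ is the spatial wavelength, $a$ is the amplitude, $t = R_e\varphi$, and $\psi$ is an arbitrary phase shift. After specular
reflection the ray turns through the angle $\beta$. For small amplitudes and wavelengths, $a \ll R_e$, $\lambda \ll R_e$ and
$2\pi a/\lambda \ll 1$, the angle $\beta$ approximately equals

$$\beta \approx \beta_{max} \cos\left(\frac{2\pi}{\lambda}\,t + \psi\right) \tag{6.2}$$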

where $\beta_{\max} = 4\pi a/\Lambda$. The behavior of rays over a periodical boundary depends on the numerical value of
the parameter $\varepsilon$, which equals the ratio of $\beta_{\max}$ to the critical trapping angle $\alpha_c$ of rays in a duct with
a smooth boundary: $\varepsilon = \beta_{\max}/\alpha_c$. For $\varepsilon \approx 0.04$ the results of the calculation are shown in Figure 4.
Here $\varepsilon \ll 1$ and the influence of such a boundary on the rays is not very large. Only a part of the rays near
the separatrix cannot be trapped in the duct. A periodical boundary can influence ducting only through
rays that can be reflected from the surface. For these rays the curves in the phase portrait can become
multiperiodical. In Figure 4, curve 1 is three-periodical and curve 2 is two-periodical. For such
boundary conditions a modified method of the ray adiabatic invariant can be used, in which the integration
in (4.1) is over several periods of ray oscillation in the duct. For surface-based ducts the share of rays
reflecting from the boundary is larger than for elevated ducts, and the effects are stronger. When
the parameter $\varepsilon$ increases, the curves in the phase portrait, as in the case of a periodically modulated
refractive index, begin to fill in separate layers. For large values of $\varepsilon$ the trapping of
rays reflecting from the boundary becomes impossible. The only exception is the situation when the
space period $T$ of ray oscillations in the duct is a multiple of the wavelength, $T = n\Lambda$, $n = 1, 2, 3, \ldots$, and
$\beta = 0$. In real situations one can expect that, for large values of $\varepsilon$, only a narrow beam of rays can
propagate in a duct over a periodical boundary for a sufficiently long distance.
The most favorable situation is when rays reflect from the self-focusing regions of the boundary,
where the curvature is positive. For $\varepsilon \approx 7$ the results of ray tracing are shown in Figure 5.
Seven rays were tracked, and the distribution of the maximal length (in km) of the ray paths in the duct
is shown as a histogram in the lower right corner of Figure 5. Knowledge of such a distribution
allows one to estimate the expected field strength in ducts. The difference in elevation angles
$\Delta\alpha$ of the tracked rays was only 1% of the critical trapping angle, $\Delta\alpha/\alpha_c < 0.01$, but the lengths of
the ray paths differ by a factor of approximately 2.7. The parameter $\varepsilon$ grows as $\Lambda$ shortens. It is in the
case of large $\varepsilon$ that propagation over a periodical boundary differs from propagation in a periodically
modulated atmosphere: modulation of the atmosphere with a wavelength far less than the period $T$ of
ray oscillations in the duct has no large effect on ducting. For values $\varepsilon < 1$ the behavior of rays depends
on the initial conditions. Figure 6 and Figure 7 show the results of ray tracing for $\varepsilon \approx 0.34$.
For Figure 6 the phase shift was $\psi = \pi/2$; for Figure 7 the phase shift was $\psi = 0$. In both figures the phase
space is subdivided into two zones. These are zones of stability, where ducting over a long distance is
possible. The ray corresponding to curve 1 in Figure 6 propagates in the duct for a very long distance
(more than 1000 km), jumping from one zone into the other. The same ray in Figure 7 leaves the duct
after 3 reflections from the boundary. Another ray in Figure 7 propagates for a long distance in the
same manner as the first ray in Figure 6. So for values of $\varepsilon$ near or less than 1 the behavior of rays
is very sensitive to the conditions of ray reflection from the earth. The stability zones in Figures 6 and 7
are separated by a gap. It is a zone of instability. It determines the range of elevation angles at
the reflection point, for a given phase shift $\psi$, where ducting is impossible.
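
As a hedged illustration of Eqs. (6.1) and (6.2), the minimal Python sketch below evaluates the boundary height, the turn angle of a mirror-reflected ray, and the ratio of the maximal turn angle to the critical trapping angle. It assumes a locally flat picture in which the ray turns by twice the local slope angle of the boundary. The function names and the sample values for the amplitude, space wavelength and critical trapping angle are hypothetical and are not taken from the calculations reported here.

```python
import math


def boundary_height(xi, a, wavelength, psi=0.0):
    """Height of the boundary above the radius R_e, i.e. r - R_e in Eq. (6.1)."""
    return a * (1.0 + math.sin(2.0 * math.pi * xi / wavelength + psi))


def turn_angle_exact(xi, a, wavelength, psi=0.0):
    """Turn angle after mirror reflection: twice the local slope angle (assumption)."""
    slope = (2.0 * math.pi * a / wavelength) * math.cos(
        2.0 * math.pi * xi / wavelength + psi
    )
    return 2.0 * math.atan(slope)


def turn_angle_small_slope(xi, a, wavelength, psi=0.0):
    """Small-slope form of Eq. (6.2): beta_max * cos(2*pi*xi/wavelength + psi)."""
    beta_max = 4.0 * math.pi * a / wavelength
    return beta_max * math.cos(2.0 * math.pi * xi / wavelength + psi)


def trapping_parameter(a, wavelength, alpha_c):
    """Ratio of the maximal turn angle beta_max to the critical trapping angle."""
    return (4.0 * math.pi * a / wavelength) / alpha_c


# Hypothetical sample values (metres and radians), chosen only for illustration.
a, wavelength, alpha_c = 1.0, 1.0e4, 3.7e-3
print(trapping_parameter(a, wavelength, alpha_c))
print(turn_angle_exact(0.0, a, wavelength), turn_angle_small_slope(0.0, a, wavelength))
```

For small slopes the two turn-angle functions agree closely, which is the regime in which Eq. (6.2) is stated to hold.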


7.  Summary. 

Propagation of VHF/UHF waves is determined in many cases by the shape of the terrain. The distance
of radio communication and the characteristics of received signals depend strongly on many processes
connected with reflection from different areas of terrain, diffraction at sharp edges, and scattering
produced by atmospheric turbulent fluctuations [1], [7]. Under ducting conditions the distance of a
radio link can increase significantly, but the structure of the received signals can be very complicated.
Periodicity in the boundary conditions causes leakage of radio waves from ducts and determines the
upper limit of the radio link distance. On the other hand, periodicity in the structure of the duct parameters
leads to mixing and coupling of rays and is responsible for additional phase delays of received
signals. These effects can be very significant for assessment of the field strength. When the maximal
inclination angle of the tangent to the boundary surface exceeds the critical trapping angle, ducting
in the general situation becomes impossible. Reflection from a periodical boundary can increase
the distance of a radio link only in the case of a periodically modulated refractive index structure.
After reflection the rays can be trapped in a periodically modulated elevated duct and propagate
for a longer distance than in the unmodulated case over a periodical boundary.

Acknowledgment.  The  author  is  indebted  to  V. A. Popov  for  his  numerical  algorithm  for  fit 
method. 
Abstract

Accurate and rapid evaluation of radar signature for alternative aircraft/store configurations
would be of substantial benefit in the evolution of integrated designs that meet
RCS requirements across the threat spectrum. Finite-volume time-domain methods offer
the possibility of modeling the whole aircraft, including penetrable regions and stores, at
longer wavelengths on today's supercomputers and at typical airborne radar wavelengths
on the massively parallel teraflop computers of tomorrow. To realize this potential, practical
means are being developed for the rapid generation of grids on and around the aircraft,
and numerical algorithms that maintain high order accuracy on such grids are being
constructed.

A structured grid and an unstructured grid-based finite-volume, time-domain
Maxwell's equation solver has been developed incorporating modeling techniques for general
radar absorbing materials. Using this work as a base, the goal of the CEM effort is
to define, implement, and evaluate rapid prototype signature prediction, addressing many
issues related to 1) physics of electromagnetics, 2) efficient and higher-order accurate
algorithms, 3) boundary condition procedures, 4) geometry and gridding (structured and
unstructured), 5) computer architecture (SIMD and MIMD), and 6) validation.

Introduction 

The ability to predict radar return from complex structures with layered material
media over a wide frequency range (100 MHz to 20 GHz) is a critical technology need
for the development of stealth aerospace configurations. Traditionally, radar cross section
(RCS) calculations have employed one of two methods: high frequency asymptotics, which
treats scattering and diffraction as local phenomena; or solution of an integral equation
(in the frequency domain) for radiating sources on (or inside) the scattering body, which
couples all parts of the body through a multiple scattering process. A third approach
is the direct integration of the differential or integral form of Maxwell's equations in the
time-domain.

The time-domain Maxwell's equations represent a more general form than the
frequency-domain vector Helmholtz equations, which are usually employed in solving
scattering problems. A time-domain approach can, for instance, handle continuous wave
(single frequency) as well as single pulse (broadband frequency) transient response.
Frequency-domain-based methods usually provide the RCS response for all angles of
incidence at a single frequency, while time-domain-based methods provide solutions for many
frequencies from a single transient calculation. Also, in a time-domain approach, one
can consider time-varying material properties for treatment of active surfaces. By using
Fourier transforms, the time-domain transient solutions can be processed to provide the
frequency-domain response. Frequency-dependent (dispersive) and anisotropic material
properties can also be included within the time-domain formulation.
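
The post-processing step described above, in which one transient calculation yields the response at many frequencies, can be sketched in Python as follows. This is only an illustration under stated assumptions: the incident pulse is a Gaussian, the "scattered" signal is a toy half-amplitude delayed copy of it, and the function names `dft` and `frequency_response` are hypothetical. No data from the solver discussed here is used.

```python
import cmath
import math


def dft(samples, n_bins):
    """Direct discrete Fourier transform for the first n_bins frequency bins."""
    n = len(samples)
    return [
        sum(s * cmath.exp(-2j * math.pi * m * k / n) for k, s in enumerate(samples))
        for m in range(n_bins)
    ]


def frequency_response(incident, scattered, n_bins):
    """Ratio of scattered to incident spectra, bin by bin."""
    inc = dft(incident, n_bins)
    sca = dft(scattered, n_bins)
    return [s / i for s, i in zip(sca, inc)]


# Toy transient: a Gaussian incident pulse and a "scattered" signal that is
# simply a half-amplitude copy delayed by 10 samples (hypothetical data).
n = 256
incident = [math.exp(-((k - 40) / 4.0) ** 2) for k in range(n)]
scattered = [0.5 * incident[(k - 10) % n] for k in range(n)]

response = frequency_response(incident, scattered, n_bins=16)
for m, value in enumerate(response):
    print(m, round(abs(value), 6))
```

Normalizing by the incident spectrum removes the pulse shape, so every bin in which the pulse carries energy gives a usable frequency-domain value from the same time-domain run.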

CEM is a critical technology in the advancement of future aerospace development
through supercomputing. As we transition from the present Gigaflops to the next generation
Teraflops computing, CEM will become integral to aerospace design not only as a
stand alone technology but also as part of the multidisciplinary coupling that leads to well
optimized designs.

Objectives 

Toward establishing a computational environment for performing multidisciplinary
studies, the initial goal is to advance the state-of-the-art in CEM with the following specific
objectives.

1) Apply algorithmic advances in Computational Fluid Dynamics (CFD) to solve
Maxwell's equations in general form to study scattering (radar cross section), radiation
(antenna), and a variety of electromagnetic environmental (electromagnetic
compatibility, shielding, and interference) problems of interest to both the defense
and commercial community.

2) Establish the viability of MIMD massively parallel architectures for tackling large
scale problems not amenable to present day supercomputers.

3) Mature the CEM technology to the point of being able to perform coupled CFD/CEM
optimization design studies.

CEM  Issues 

Proper development of a CEM capability appropriate for all aspects of
aerospace design must consider various issues associated with electromagnetics. Some
of them are:

1)  Maxwell’s  Equations 

In order to apply conservation principles (for example, in fluid dynamics mass, momentum,
and energy are conserved), many of the governing equations representing appropriate
physical processes are written in conservation form. The general form of a differential
conservation equation can be written as

$$Q_t + E_x + F_y + G_z = \text{Source} \qquad (1)$$

where Q is the solution vector and E, F, and G are the fluxes in the x, y, and z coordinate
directions, respectively. The conservation form readily admits weak solutions such as shock
waves.


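The integral form of the conservation laws can easily be derived from the differential
form by integrating Eq. (1) with respect to x, y, and z over any conservation cell whose
volume is V:

$$\iiint_V \left( \frac{\partial Q}{\partial t} + \frac{\partial E}{\partial x} + \frac{\partial F}{\partial y} + \frac{\partial G}{\partial z} \right) dx\,dy\,dz = \iiint_V \text{Source}\; dx\,dy\,dz = S$$
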
This can be rewritten in vector notation as

$$\frac{\partial}{\partial t} \iiint_V Q\; dx\,dy\,dz + \iiint_V (\nabla \cdot \mathbf{F})\; dx\,dy\,dz = S$$

In the above,

$$\mathbf{F} = E\,\hat{j} + F\,\hat{k} + G\,\hat{l}$$


Applying the Gauss divergence theorem, we can convert the volume integral into a surface
integral:

$$\frac{\partial}{\partial t}\left(\bar{Q} V\right) + \oint_{\partial V} (\mathbf{F} \cdot \hat{n})\; ds = S \qquad (5)$$

In the above equation, the cell averages of the dependent variables are denoted by $\bar{Q}$,

$$\bar{Q} = \frac{1}{V} \iiint_V Q\; dV$$

The outward unit normal at any point of the boundary surface of a cell has been denoted by
$\hat{n} = n_x \hat{j} + n_y \hat{k} + n_z \hat{l}$.

The integral form of the conservation laws given by Eq. (5) defines a system of equations
for the cell average values of the dependent variables.

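Maxwell's equations in their vector form are

$$\frac{\partial \mathbf{B}}{\partial t} = -\nabla \times \mathbf{E} \qquad (7)$$

$$\frac{\partial \mathbf{D}}{\partial t} = \nabla \times \mathbf{H} - \mathbf{J} \qquad (8)$$

The divergence conditions $\nabla \cdot \mathbf{D} = \rho$ and $\nabla \cdot \mathbf{B} = 0$ are derived directly from Maxwell's
equations, where $\nabla \cdot \mathbf{J} = -\partial\rho/\partial t$. The vector quantities $\mathbf{E} = (E_x, E_y, E_z)$ and
$\mathbf{H} = (H_x, H_y, H_z)$ are the electric and magnetic field intensities, $\mathbf{D} = (D_x, D_y, D_z)$ is the
electric displacement, $\mathbf{B} = (B_x, B_y, B_z)$ is the magnetic induction, $\mathbf{J} = (J_x, J_y, J_z)$ is the
current density, and $\rho$ is the charge density. The subscripts x, y, z in the vector representation
of $\mathbf{E}$, $\mathbf{H}$, $\mathbf{B}$, and $\mathbf{D}$ refer to components in the respective directions.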

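Maxwell's equations can also be cast in integral conservation form as

$$\frac{\partial}{\partial t} \iiint_V \begin{pmatrix} \mathbf{B} \\ \mathbf{D} \end{pmatrix} dV + \oint_{\partial V} \begin{pmatrix} \hat{n} \times \mathbf{E} \\ -\hat{n} \times \mathbf{H} \end{pmatrix} dS = 0$$

where the six components of $\mathbf{F} \cdot \hat{n}$ in Eq. (5) are $(\hat{n} \times \mathbf{E},\; -\hat{n} \times \mathbf{H})$.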
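
The step from the vector form, Eqs. (7) and (8), to the integral conservation form can be made explicit with the curl version of the divergence theorem. The short derivation below is a sketch that assumes a source-free region (zero current density), which is what the zero right-hand side of the integral form suggests; the cell is written as V with boundary ∂V.

```latex
% Curl theorem for any smooth vector field A over a cell V:
%   \iiint_V \nabla \times \mathbf{A}\, dV = \oint_{\partial V} \hat{n} \times \mathbf{A}\, dS
\begin{align*}
\frac{\partial}{\partial t} \iiint_V \mathbf{B}\, dV
  &= -\iiint_V \nabla \times \mathbf{E}\, dV
   = -\oint_{\partial V} \hat{n} \times \mathbf{E}\, dS, \\
\frac{\partial}{\partial t} \iiint_V \mathbf{D}\, dV
  &= \iiint_V \nabla \times \mathbf{H}\, dV
   = \oint_{\partial V} \hat{n} \times \mathbf{H}\, dS
  \qquad (\mathbf{J} = 0).
\end{align*}
% Moving the surface terms to the left-hand side gives the flux components
% (\hat{n} \times \mathbf{E}, \; -\hat{n} \times \mathbf{H}) quoted in the text.
```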

2) Finite-Volume Treatment

The major feature of the present discretization approach that distinguishes it from
other finite-volume and finite-difference procedures is that the electric and magnetic field
unknowns are co-located in both space and time, rather than being assigned to two
interpenetrating spatial grids and separated a half-step in time. These field unknowns are the
volume averages of E and H within each cell in the space filling grid.

An algorithm that maintains second-order accuracy in both space and time can be
constructed as follows (advancing from time level m to m + 1):

=  {Qv:  f  " ■F{Q::")ds 

Jda 

K  =  A  /  vQTds  =  ~  f  "  (>>  X  X  [{o;;'  -  e"']  })ds 

to  Jon  to  Jd,y 

e:y +  (x  -  >■<.)  ■  av"  fw  >■  in  cdi « 

<0;;'+'  =  <«"'  - "  •  F  ds  . 

Here we have written Maxwell's equations symbolically as

$$\frac{\partial Q}{\partial t} + \nabla \cdot \mathbf{F}(Q) = 0, \qquad Q = (\mathbf{B}, \mathbf{D}),$$

and the solution of the Riemann problem just inside a cell interface is denoted Q*
(Ref. 1).
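
To make the roles of co-located cell averages and the interface Riemann solution concrete, the following Python sketch advances the one-dimensional Maxwell system with a first-order Godunov-type finite-volume update. It is deliberately simpler than the second-order scheme outlined above and is not the authors' algorithm: it assumes vacuum in normalized units (permittivity, permeability and wave speed all equal to 1), a uniform periodic grid, and hypothetical function names.

```python
import math


def riemann_state(e_left, h_left, e_right, h_right):
    """Interface values (E*, H*) for 1D Maxwell in normalized vacuum units."""
    w_plus = e_left + h_left      # right-going characteristic, from the left cell
    w_minus = e_right - h_right   # left-going characteristic, from the right cell
    return 0.5 * (w_plus + w_minus), 0.5 * (w_plus - w_minus)


def step(e, h, cfl):
    """One finite-volume step; E and H are cell averages stored in the same cells."""
    n = len(e)
    # Interface i lies between cell i-1 and cell i (index -1 wraps periodically).
    star = [riemann_state(e[i - 1], h[i - 1], e[i], h[i]) for i in range(n)]
    e_new, h_new = [], []
    for i in range(n):
        e_lo, h_lo = star[i]
        e_hi, h_hi = star[(i + 1) % n]
        # dE/dt = -dH/dx and dH/dt = -dE/dx, written as flux differences.
        e_new.append(e[i] - cfl * (h_hi - h_lo))
        h_new.append(h[i] - cfl * (e_hi - e_lo))
    return e_new, h_new


# A right-going pulse (E = H) on a periodic grid of 64 cells.
n_cells = 64
pulse = [math.exp(-((i - 16) / 4.0) ** 2) for i in range(n_cells)]
e, h = pulse[:], pulse[:]
for _ in range(n_cells):
    e, h = step(e, h, cfl=1.0)
print(max(abs(x - y) for x, y in zip(e, pulse)))
```

Because each interface flux is subtracted from one cell and added to its neighbor, the sum of the cell averages is conserved for any time step, which is the discrete counterpart of the integral conservation form.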

3)  Geometry/Gridding 

Problems in CEM involve arbitrarily shaped three-dimensional geometries that need
to be represented properly in the computer simulation. In addition to the external shape,
CEM also requires modeling the interior of the penetrable structure. Depending on the
formulation (differential or integral), one may choose either a structured grid or an
unstructured grid setup.

Two gridding issues that need to be addressed in EM computations are: 1) the number
of grid points per wavelength needed to properly represent the fields in and around a
scatterer; and 2) how far the outer boundary should be placed from the scattering object
to adequately simulate the nonreflecting boundary condition. In general, the number of
points/wavelength is not determined by wavelength alone, and involves the body dimensions
(characteristic body size with respect to wavelength) also. The outer boundary
location, theoretically, can be right on the body surface itself; however, the computational
implementation of nonreflecting boundary conditions requires the outer boundary at a few
(2 to 5) wavelengths away from the surface. Again, if one can construct higher order
accurate implementations of nonreflecting boundary conditions, the outer boundary can be
brought very close to the scattering surface. In general, the necessary grid resolution is
provided only around and near the body surface. Between the body and the outer boundary,
the mesh is allowed to stretch, resulting in very crude (3 to 5 points per wavelength)
meshes near the outer boundary regions.
The free space wavelength is reduced to smaller values inside a material (as $\epsilon$ and $\mu$
become large, the speed of propagation, $c = 1/\sqrt{\epsilon\mu}$, goes down, causing the wavelength to
scale accordingly). Thus, the grid resolution must take into account material properties
to adequately resolve the fields inside material zones.

The  number  of  grid  points  per  wavelength  required  depends  on  the  order  of  accuracy 
of the numerical scheme. A second-order accurate scheme usually requires at least ten grid
points  per  local  wavelength.  One  may  be  able  to  use  a  higher  order  scheme  and  minimize 
the  number  of  grid  points.  However,  as  the  order  of  accuracy  goes  up,  the  scheme  will  also 
require  more  computations  per  grid  point,  which  may  offset  the  execution  savings  with 
fewer  grid  points. 

The requirement that the fields are resolved accurately with proper grid resolution
makes CEM problems computationally intensive, requiring large scale supercomputing.
For example, to compute the radar cross section of a typical aircraft at 1 GHz, even if one
used 10 grid cells per wavelength, it will require tens of millions of grid points.
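
The order of magnitude quoted above can be checked with a rough estimate. The Python sketch below assumes a uniform Cartesian grid in free space, a bounding box around the body, and an outer boundary two wavelengths away on every side. The body dimensions (15 m by 10 m by 5 m) and the function names are hypothetical, and a stretched mesh of the kind described in this section would need fewer points than this uniform-grid count.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def local_wavelength(freq_hz, eps_r=1.0, mu_r=1.0):
    """Wavelength inside a material with relative permittivity and permeability."""
    return C0 / (freq_hz * math.sqrt(eps_r * mu_r))


def estimate_cells(body_dims_m, freq_hz, points_per_wavelength=10,
                   margin_wavelengths=2.0):
    """Uniform-grid cell count for a box around the body plus an outer margin."""
    lam = local_wavelength(freq_hz)
    dx = lam / points_per_wavelength
    cells = 1
    for length in body_dims_m:
        cells *= math.ceil((length + 2.0 * margin_wavelengths * lam) / dx)
    return cells


# Hypothetical fighter-sized bounding box at 1 GHz.
print(estimate_cells((15.0, 10.0, 5.0), 1.0e9))
# Wavelength shrinks inside a material, e.g. relative permittivity 4.
print(local_wavelength(1.0e9), local_wavelength(1.0e9, eps_r=4.0))
```

Under these assumptions the count lands in the tens of millions, and halving the local wavelength inside a material region multiplies the cell count there by eight in three dimensions.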

4) Massively Parallel Computing

4.1  Parallel  Implementation 

With the emergence of massively parallel computing architectures with potential for
teraflops performance, any code development activity must effectively utilize the computer
architecture in achieving the proper load balance with minimum internodal data
communication.

The  structured  finite-volume  code  was  originally  developed  and  optimized  for  vector 
computer  architectures.  The  implementation  of  the  code  on  a  distributed  memory  parallel 
architecture  was  accomplished  by  re-using  much  of  the  original  vector  code.  Additional 
coding  was  added  for  handling  inter-processor  communication  and  other  functions  unique 
to  the  parallel  implementation. 

4.2  Parallelization  Strategy 

For the structured formulation of the finite-volume code the computational domain
surrounding the target geometry is composed of 3-dimensional 6-sided volumes of grid
points called zones or blocks. Each side or face of a zone either connects to another zone or
has a boundary condition defined on that face (perfect conducting surface, outer boundary,
etc.). The parallel algorithm takes advantage of this multi-zonal gridding capability in
order to divide work among processors. The various zones are grouped onto processors,
with each processor obtaining a solution for the cells within its own local set of zones.

4.3  Communication  Requirements 

The solution procedure does not allow for processors to proceed completely
asynchronously. Solving for cells on zone faces that are connected to other zones requires
information from within the adjacent zone. This information may be available locally if
the adjacent zone resides on the same processor, or message passing may be required if
the adjacent zone resides on another processor. This boundary update message passing or
flux transfer message passing is done twice per solution time step and forms the bulk of
the parallel code's message passing requirements.
4.4 Load Balancing

Load balancing is achieved by mapping zones onto processors. Perfect load balancing
requires that each processor have the same number of zones, each containing the same
number of grid points and equal numbers and types of boundary condition cells. Simple
geometries may usually be zoned in such a manner as to obtain perfect load balancing.
For complex geometries perfect load balancing is much more difficult, but adequate load
balancing may usually be obtained by mapping a close to equal number of grid points onto
each processor.
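
One simple way to map zones onto processors so that grid-point counts are close to equal is a greedy "largest zone first" heuristic. The Python sketch below is an assumption made for illustration, not the mapping procedure used in the code described here; zone names, sizes and function names are hypothetical, and boundary-condition cell types are ignored.

```python
import heapq


def assign_zones(zone_sizes, n_procs):
    """Greedy mapping: give the next-largest zone to the least-loaded processor."""
    heap = [(0, p) for p in range(n_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_procs)}
    by_size = sorted(zone_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for zone, size in by_size:
        load, proc = heapq.heappop(heap)
        assignment[proc].append(zone)
        heapq.heappush(heap, (load + size, proc))
    return assignment


def imbalance(zone_sizes, assignment):
    """Largest processor load divided by the mean load (1.0 is perfect balance)."""
    loads = [sum(zone_sizes[z] for z in zones) for zones in assignment.values()]
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical zone sizes in thousands of grid points.
zones = {"a": 60, "b": 40, "c": 30, "d": 30, "e": 20, "f": 20}
mapping = assign_zones(zones, n_procs=2)
print(mapping, imbalance(zones, mapping))
```

The heuristic does not always find a perfect split even when one exists, which mirrors the distinction drawn above between perfect and merely adequate load balancing.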

4.5 Scalability Results

Validation and timing studies have been performed on a 512-node nCUBE and a
208-node Intel Paragon. Currently the code shows good scalability on evenly balanced
test cases. These cases typically had simple gridding requirements and a straightforward
domain decomposition. The results show that inter-processor communication due to flux
transfer never becomes a dominant time factor even on problems with large numbers of
grid points run on many processors. The sphere test case illustrated in Fig. 1 shows
how problem size and number of processors can be increased while solution time remains
level. Perfectly conducting sphere grids were run on 6, 24, and 96 processors of the Intel
Paragon. The number of grid points per processor remained constant at approximately
60,000, resulting in total grid sizes of approximately 0.35, 1.4, and 5.7 million grid points for
the three cases. Since increasing the number of processors results in an increased number
of zonal interfaces, flux message passing requirements increase throughout the system.
Despite this increase in required message passing, communication times did not change
appreciably.
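
The sphere test is a fixed-work-per-processor (weak scaling) experiment, and the quoted grid sizes follow directly from the per-processor load. The Python sketch below reproduces that arithmetic and defines the usual weak-scaling efficiency as the ratio of solution times. The function names are hypothetical, and no timing values are supplied here because none are given in the text.

```python
def total_grid_points(points_per_processor, processors):
    """Total problem size when every processor holds the same number of points."""
    return points_per_processor * processors


def weak_scaling_efficiency(base_time, scaled_time):
    """Equals 1.0 when solution time stays level as processors and size grow."""
    return base_time / scaled_time


# Approximately 60,000 grid points per processor on 6, 24, and 96 processors.
for processors in (6, 24, 96):
    print(processors, total_grid_points(60_000, processors))
```

With exactly 60,000 points per processor the totals are 0.36, 1.44, and 5.76 million, consistent with the approximate figures reported for the three cases.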

Complex problems such as full scale fighter geometries also show encouraging results.
Figure 2 shows timing results and zoning for the VFY218 fighter gridded for a frequency
of 500 MHz with a 10 point per wavelength resolution. A total of 58 zones and 2.2 million
grid points were required. The grid was run on an Intel Paragon using 28, 61, and 128
processors. Preliminary timing data reveals that communication overhead remains at
between 1 and 2.5 percent of the total solve time and that solution speedup occurs as the
problem is distributed over more processors. Speedup may be improved by addressing load
balancing issues arising from complex zoning arrangements.
Parallel Solutions of Maxwell's Equations on the Meiko CS-2

Niel  Madsen,  Bill  Erne,  David  Steich,  Grant  Cook 

Lawrence  Livermore  National  Laboratory 
Livermore,  CA  94550 


Abstract 

The efficient numerical solution of Maxwell's equations in the time-domain on parallel computers is a
non-trivial task. It is even more challenging when the numerical solution method involves the use of
unstructured non-orthogonal and multi-element type grids.

This paper describes some efforts and experiences in utilizing distributed memory [...]
like the Meiko CS-2 together with the DSI3D algorithm. Grid generation remains the most [...]
of accurate numerical EM simulation. However, this process will not be discussed as it is largely
independent of the parallel computation issues.

The parallel solution paradigm chosen is that a single large problem will be solved through use of
multiple processors. This is in contrast to the "embarrassingly parallel" approach of solving many
independent problems simultaneously (one per processor). The choice of this paradigm necessitates the
partitioning of the single large problem among the available processors. In order for this [...]
two conditions must be met: 1) each processor should have an equal workload, and 2)
inter-processor communication should be minimized. Several approaches are presented and discussed.

Once the problem has been partitioned, the inter-processor communication issues must be addressed. It
is assumed that at problem startup time, a given processor knows only about its [...]
and knows nothing about any of the other processors' data. A definitive [...]
use to discover its neighbors and related required communication is presented. This, of course, requires
significant inter-processor communication. As some variables which exist on [...]
boundaries may be shared between two or more processors, questions of ownership arise and simple
techniques for resolving asynchronous communication conflicts are established.

As some of the newer parallel computer processors are capable of processing [...]
efficiently, performance optimization through vector processing methods are [...]
improvements of factors up to 20 have been noted. Specific performance figures for the Meiko CS-2
are presented.

Overall performance speedup data will be presented for the geometry [...]
stepping processor. High efficiencies are achieved provided that there is a sufficient workload (problem
size) maintained on each processor.

Finally,  performance  issues  related  to  outer  absorbing  radiation  boundary  conditions  and  near-to-far 
field  transformations  are  discussed. 
Parallelization of the CARLOS-3D Method of Moments Code


J.M.  Putnam  and  D.D.  Car 
McDonnell  Douglas  Corporation 
J.D.  Kotulski 

Sandia  National  Laboratories 


Abstract 

Flat  triangular  patches  and  linear  roof-top  basis  functions  (S.  M.  Rao,  D.  R.  Wilton,  and 
A.  W.  Glisson,  IEEE  AP  Trans.,  30,  409-418,  May  1982)  have  been  used  extensively  over  the  past 
few  years  to  model  the  scattering  and  radiation  from  complex  three-dimensional  objects.  CARLOS-3D 
is  a  three-dimensional  scattering  code  based  on  a  Galerkin  method  of  moments  formulation  employing 
roof-top  basis  functions  to  model  fully  arbitrary  geometries  composed  of  multiple  conducting  and 
homogeneous  dielectric  regions.  The  code  was  developed  under  the  sponsorship  of  the 
Electromagnetic  Code  Consortium  (EMCC)  and  is  available  to  qualified  users.  Various  boundary 
conditions  including  conducting,  dielectric,  resistive,  and  impedance  can  be  specified  on  selected 
surfaces  composing  the  target.  Current  continuity  between  connected  surfaces  is  rigorously  enforced 
using  a  general  indexing  scheme  in  the  code  to  implement  all  of  the  boundary  and  junction  conditions. 
The  code  is  based  on  a  Galerkin  matrix  operator  notation  which  makes  the  main  structure  of  the  code 
independent  of  the  geometry  representation  and  basis  functions  used.  Matrix  symmetry  is  exploited 
during  the  system  matrix  fill  procedure  to  reduce  run  times,  and  the  symmetric  system  matrix  is  stored 
in  packed  form  to  reduce  memory  requirements.  Geometric  symmetry  is  also  used  to  further  reduce 
both  the  computational  and  memory  requirements  for  large  symmetric  targets.  The  code  has  been 
validated  against  measured  data  and  other  codes  for  a  large  number  of  targets  and  is  currently  used  by 
over  50  aerospace  companies  and  government  agencies. 

This  paper  describes  how  the  serial  code  was  ported  to  the  Intel  Paragon.  Details  are  given  outlining 
the  key  steps  in  the  parallelization  process,  along  with  special  features  in  the  code  which  facilitated 
the  effort.  The  focus  will  be  primarily  on  the  parallel  implementation  of  the  matrix  fill,  right-hand-side 
fill,  and  the  far-field  computation.  The  solution  for  the  current  coefficients  relies  upon  existing  parallel 
solver  packages.  The  code  has  been  adapted  to  both  the  Intel  Pro-Solver-DES  package  for  out-of-core 
solutions  and  an  in-core  solver  developed  at  Sandia. 

Results  are  presented  showing  the  performance  of  the  code  for  some  large  scattering  problems.  Scaling 
of  the  computational  resources  with  problem  size  is  also  addressed,  along  with  other  issues  related  to 
the  parallel  implementation. 

Introduction 

CARLOS-3D is a general-purpose method of moments (MM) code for computing the scattering from
complex three-dimensional targets. It is based on a McDonnell Douglas Aerospace proprietary code
CARLOS, which models antenna and scattering problems for 2D and 3D geometries. [...]
was developed under the sponsorship of the Electromagnetics Code Consortium (EMCC) and is
available to qualified organizations through them, subject to US export control laws. The code uses
the MM technique, with Galerkin testing, to solve the Stratton-Chu surface [...]
user specified geometry. All of the surfaces describing the scatterer, consisting [...]
and boundaries between different dielectric regions are replaced with equivalent electric (J) and
magnetic (M) currents. The code solves for these induced equivalent currents, which are then used to
compute the scattered far-fields.

Surfaces are modeled using flat triangular facets, with the electric and magnetic currents
expanded in terms of the Rao, Wilton, Glisson roof-top functions [1]. A [...]
two facets forming each interior edge of a surface. At junction edges, which are [...] by the
intersection of two or more surfaces, half roof-top expansion functions are used to [...]
currents. Current continuity across a junction is enforced by equating the unknowns associated with
each of the half roof-top functions for the edge. Junction edges are determined automatically for
surfaces which have common node points along a line of intersection between surfaces.

Complex geometries which are composed of multiple conducting and bulk dielectric regions can be
modeled. Various boundary conditions can be imposed separately on each of the surfaces comprising
the target. For a given geometry, each dielectric region is given a number and every surface which
forms a boundary between different regions must be entered as a faceted surface. Infinitesimally-thin
conducting, resistive, and impedance sheets must also be defined. The user must specify both the
interior and exterior regions for each surface, along with the boundary condition to [...]
and magnetically-conducting boundaries can be either embedded in a region or be defined on the
interface between different regions. Tapered resistive and impedance surfaces are [...]
specifying values for each facet forming the surface. Impedance (Leontovich) boundary conditions
are modeled as an equivalent combination resistive/magnetically-conducting boundary.

The Stratton-Chu integral equations can be solved using several [...]
surfaces, either the electric field integral equation (EFIE), the magnetic field integral equation (MFIE),
or the combined field integral equation (CFIE) can be used. The coupling parameter in the CFIE
formulation is specified separately on each surface, allowing the formulation to be used for geometries
with both open and closed surfaces. For dielectric boundaries, the PMCHW formulation (Poggio,
Miller, Chang, Harrington, and Wu) is used. Galerkin testing, in [...]
expansions results in a symmetric system of equations whenever the PMCHW/EFIE formulation is
used. In addition, the formulations implemented for treated surfaces also result in symmetric matrices
which require only half of the matrix elements to be computed and stored.
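
Storing only half of a symmetric matrix is commonly done with a packed one-dimensional array. The Python sketch below shows one such layout (row-major upper triangle) and a fill loop that evaluates each distinct element once. It is a generic illustration under that assumption, not the storage scheme actually used in CARLOS-3D, and the function names are hypothetical.

```python
def packed_index(i, j, n):
    """Position of element (i, j) of an n-by-n symmetric matrix in packed storage."""
    if i > j:
        i, j = j, i  # (i, j) and (j, i) share one stored value
    # Rows 0..i-1 of the upper triangle hold n, n-1, ..., n-i+1 entries.
    return i * n - i * (i - 1) // 2 + (j - i)


def fill_symmetric(n, element):
    """Evaluate element(i, j) only for j >= i and store it in packed form."""
    packed = [0.0] * (n * (n + 1) // 2)
    for i in range(n):
        for j in range(i, n):
            packed[packed_index(i, j, n)] = element(i, j)
    return packed


def get(packed, i, j, n):
    """Read any element, using symmetry for the lower triangle."""
    return packed[packed_index(i, j, n)]


# Hypothetical symmetric kernel standing in for a matrix-element routine.
matrix = fill_symmetric(4, lambda i, j: 1.0 / (1 + i + j))
print(len(matrix), get(matrix, 2, 1, 4), get(matrix, 1, 2, 4))
```

For n unknowns this stores n(n+1)/2 values instead of n squared, and the fill loop makes the same number of element evaluations.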

Scattering from apertures, cavities, and gaps in a conducting surface can be modeled using an infinite
ground plane option. Image theory is used to model both the source and the induced currents on all of
the surfaces which are either conducting, dielectric or treated. The resulting system of equations is
again symmetric.
Features

The  CARLOS-3D  code  has  a  flexible  and  modular  structure  which  facilitates  the  incorporation  of  new
features.  The  major  components  of  the  code  are  independent  of  the  surface  representation  (i.e.,  flat 
facets,  quadrilaterals,  curved  surfaces,  etc.),  and  of  the  basis  functions  which  are  used  to  approximate 
the  surface  currents,  permitting  the  code  to  be  easily  adapted  to  advanced  basis  functions.  The  section 
of  code  which  generates  the  system  matrix  for  an  arbitrary  geometry  is  written  in  terms  of  a 
generalized  Galerkin  matrix  operator  notation  [2].  These  operators  result  from  testing  either  the 
integral  operators  in  the  Stratton-Chu  equations,  the  equivalent  currents  directly,  or  the  incident  fields. 
The  subroutines  which  assemble  the  system  matrix  and  right-hand-side  vector  refer  to  these  generic 
operators.  The  generic  operators  then  reference  routines  which  are  specifically  written  for  a  given 
basis  function  type.  Only  the  geometry  input  routine,  and  these  specialized  Galerkin  operator 
routines  depend  upon  the  surface  representation  and  basis  functions  used.  Symmetry  relations,  which 
can  be  established  for  the  Galerkin  matrix  operators,  are  used  to  efficiently  fill  the  system  matrix  by 
eliminating  redundant  calculations. 

A  systematic  approach  [3]  is  used  to  generate  the  matrix  equation,  ZI=V,  for  an  arbitrary  geometry 
with  various  boundary  conditions  imposed  on  the  surfaces.  This  approach  is  based  on  a  simple 
indexing  scheme,  which  assigns  an  index  number  to  each  edge  (roof-top  or  half  roof-top  function) 
defining  the  entire  geometry.  The  index  number  for  a  particular  edge  specifies  the  location  within  the 
column  vector,  I,  of  either  the  J  or  M  current  coefficient  associated  with  that  edge.  This  indexing  is 
performed  in  the  geometry  input  routine,  and  is  based  on  the  boundary  condition  which  is  imposed  on 
each  surface.  The  boundary  condition  is  used  to  define  the  equivalent  currents  which  reside  on  the 
surface,  and  the  relationship  between  the  interior  and  the  exterior  current  coefficients.  Current 
continuity  across  junction  edges  connecting  separate  surfaces  is  enforced  by  equating  the  indices 
associated  with  the  half  roof-top  functions  for  the  edge.  The  matrix  assembly  routine  is  based  on  the 
index  associated  with  each  basis  function  (edge),  and  not  on  the  explicit  form  of  the  basis  function  or 
the  surface  representation.  The  matrix  assembly  routine  in  CARLOS-3D  has  been  used  to  model  2D, 
body-of-revolution  (BOR),  and  wire  geometries  with  overlapping  triangle  function  expansions.  It  has 
also  been  used  for  3D  geometries  modeled  with  quadrilateral  patches  and  higher  order  parametric  basis 
functions. 

CARLOS-3D  can  efficiently  model  arbitrary  geometries  with  right/left  and/or  top/bottom  symmetry, 
using  only  a  half  or  quarter  of  the  geometry.  The  Galerkin  matrix  operators  defining  the  interactions 
between  symmetric  parts  of  the  target  are  related,  allowing  the  matrix  equation  to  be  decoupled  into 
either  two  or  four  smaller  systems.  Each  of  the  smaller  systems  of  equations  contains  either
approximately  a  half  or  quarter  of  the  unknowns  from  the  original  system,  and  this  decoupling  is 
independent  of  the  source  excitation.  These  decoupled  equations  are  filled  and  solved  sequentially, 
reducing  the  memory  required  to  store  the  matrix  by  a  factor  of  either  four  or  sixteen.  By  filling  each 
system  separately,  memory  requirements  are  reduced,  but  the  matrix  fill  time  is  equivalent  to  that  of 
the  original  system  assuming  no  symmetry. 
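
The decoupling for one symmetry plane can be illustrated with a small sketch. It assumes, for illustration only, that the unknowns are ordered so that the full matrix has the block form [[A, B], [B, A]], where A couples unknowns on the same half of the target and B couples each half to its mirror image; the helper names and the numerical values are hypothetical and are not taken from CARLOS-3D.

```python
def solve(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x


def combine(p, q, sign=1):
    """Entry-wise p + sign*q for matrices stored as lists of rows."""
    return [[x + sign * y for x, y in zip(rp, rq)] for rp, rq in zip(p, q)]


def solve_symmetric(A, B, b1, b2):
    """Solve [[A, B], [B, A]] [x1; x2] = [b1; b2] with two half-size solves."""
    even = solve(combine(A, B), [u + v for u, v in zip(b1, b2)])      # x1 + x2
    odd = solve(combine(A, B, -1), [u - v for u, v in zip(b1, b2)])   # x1 - x2
    x1 = [(e + o) / 2 for e, o in zip(even, odd)]
    x2 = [(e - o) / 2 for e, o in zip(even, odd)]
    return x1, x2


# Hypothetical 2x2 blocks and excitation vectors.
A = [[4 + 1j, 1.0], [1.0, 3 - 1j]]
B = [[0.5, 0.2j], [0.2j, 0.1]]
b1, b2 = [1.0, 2.0], [0.5j, -1.0]

x1, x2 = solve_symmetric(A, B, b1, b2)

# Reference: solve the full 4x4 system directly and compare.
full = [A[0] + B[0], A[1] + B[1], B[0] + A[0], B[1] + A[1]]
x_ref = solve(full, b1 + b2)
err = max(abs(u - v) for u, v in zip(x1 + x2, x_ref))
```

Each half-size matrix holds a quarter of the entries of the full matrix, which is where the factor of four in storage comes from when the two systems are filled and solved one after the other.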

Matrix  elements  in  CARLOS-3D  are  computed  using  a  combination  of  analytic  and  numerical 
procedures.  The  algorithm  is  facet-based  in  order  to  eliminate  redundant  calculations.  The  matrix 
elements  are  evaluated  using  adjustable  quadrature  formulas  to  compute  the  double  surface  integrals 
over  pairs  of  triangular  facets.  Normalized  area  coordinates,  for  flat  triangular  regions,  are  used  to 
compute generic double surface integrals between a pair of triangles, which are then used to compute
the interactions between all edge combinations. Self-term computations are based on an analytic
procedure which reduces the four-fold integral to a double integral which is integrated numerically,
with the singular part evaluated analytically. Near terms are evaluated using standard
singularity-extraction methods, with the adjustable quadratures depending upon the test and source
facet separation distance. This allows the matrix fill procedure to be optimized for both speed and
accuracy.


The  parallelization  of  CARLOS-3D  was  facilitated  by  an  option  in  the  code  which  can  be  used  to 
specify  an  arbitrary  sub-block  of  the  Z  matrix  to  fill.  This  block-fill  option  is  implemented  in  a 
manner  which  minimizes  the  number  of  redundant  matrix  element  evaluations,  and  only  requires 
memory  to  store  the  sub-block  being  generated.  The  entire  system  matrix  can  be  generated  by 
partitioning  the  matrix  into  sub-blocks,  and  then  filling  each  sub-block  separately,  with  only  a  modest 
increase  in  execution  time.  This  block-fill  option  has  been  used  to  adapt  the  code  to  out-of-core  solver 
packages  for  solving  large  problems  on  a  workstation.  It  is  also  the  key  feature  which  is  used  in  the 
parallelization  of  the  code  which  is  described  below. 
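
The block-fill idea can be sketched as follows. This is only an illustration: `element(i, j)` is a hypothetical stand-in for the facet-pair integration that produces one entry of Z (here a simple closed-form kernel so that the example runs), and the block size is arbitrary.

```python
import cmath


def element(i, j):
    """Hypothetical matrix element: a smooth, symmetric interaction kernel."""
    r = abs(i - j) + 0.5
    return cmath.exp(-1j * r) / r


def fill_block(rows, cols):
    """Fill only the requested sub-block, so memory scales with the block size."""
    return [[element(i, j) for j in cols] for i in rows]


def partition(n, block):
    """Split range(n) into consecutive index ranges of at most `block` entries."""
    return [range(s, min(s + block, n)) for s in range(0, n, block)]


n, block = 10, 4
full = fill_block(range(n), range(n))

# Generate Z one sub-block at a time, as an out-of-core or parallel fill would.
assembled = [[None] * n for _ in range(n)]
for rows in partition(n, block):
    for cols in partition(n, block):
        sub = fill_block(rows, cols)          # only this block is held in memory
        for a, i in enumerate(rows):
            for b, j in enumerate(cols):
                assembled[i][j] = sub[a][b]   # stands in for a write to disk or to a node

matches = assembled == full
n_blocks = len(partition(n, block)) ** 2
```

In a facet-based fill, each facet pair contributes to several entries, so a real implementation would also skip facet pairs whose edges all fall outside the requested block; that logic is omitted here.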


CARLOS-3D has been extensively validated against both measured data and other numerical methods.
Conducting and coated sphere results have been compared with Mie series solutions. Flat plate
calculations have been compared with the measured data for all five of the EMCC benchmark targets
[4], and for some large kite-shaped plates at MDA. Typically, converged results are [...]
to 100 triangular facets per square wavelength of surface area. Additionally, the MM BOR code
CICERO [5] has been used to validate the modeling of junctions and dielectric materials for circular
cylinders and cones. A circular cylinder which is half conducting and half Plexiglas [4, 6] was used to
validate the modeling of junctions in CARLOS-3D. This case validates the junction modeling of the
edges which form the intersection between the three surfaces (i.e., outer conductor, dielectric, and inner
conductor). Other test geometries, including an air-coated plate and a conducting cube with either one
or two attached air cubes, were also used to ensure that junctions between intersecting surfaces with
different boundary conditions are modeled correctly.


The formulations contained in CARLOS-3D for resistive, magnetically-conducting, and impedance
surfaces have been tested against the results from other codes, and for special limiting cases. These
limiting cases have been used extensively as a way of checking the self-consistency of the
formulations. As an example, the boundary conditions at a dielectric boundary can be modeled as a
resistive boundary between the two dielectrics, as the resistance becomes large. Similarly, as the
resistance approaches zero, the surface becomes conducting, and the magnetic currents and interior
electric currents should vanish. Both of these cases take a simple boundary condition and model it as
the limit of a more complicated boundary condition involving additional unknowns. Additionally,
thin-sheet formulations have been shown to be consistent with the equivalent formulation involving
two distinct regions with the same material properties.


The parallel version of CARLOS-3D is structured so that it can be run either on a workstation, or on
an Intel Paragon machine. All of the parallel logic is embedded in just a few routines which are only
executed on the parallel machine. The workstation version of the code can be easily ported to the Intel
by  following  a  list  of  conversion  steps.  This  allows  most  of  the  code  development  to  be  performed  on 
a  workstation,  with  final  testing  done  on  the  parallel  machine.  Also,  the  input  data  for  both  the  serial 
and  parallel  versions  is  identical.  In  the  parallel  version,  all  of  the  input  geometry  data  is  read  and 
processed  on  a  single  node  (node  zero).  The  geometry  data  is  then  sent  to  all  of  the  other  nodes. 
Parallel  logic  is  used  to  generate  the  MM  system  matrix  and  right-hand-side  (RHS)  excitation  vectors 
on  all  of  the  nodes.  A  parallel  solver  is  then  called  to  perform  the  matrix  factorization  and  solution  of 
the  surface  current  coefficients.  The  scattered  far-fields  are  computed  in  parallel,  with  the  final  results 
sent  to  node  zero  for  output. 

The  strategy  in  the  parallelization  effort  was  to  rely  on  existing  parallel  solver  technology  for  the 
Paragon,  and  to  simply  adapt  the  code  to  the  solver  software.  The  conversion  was  basically 
performed  in  three  steps.  The  first  step  was  actually  done  for  the  purpose  of  implementing  an  out-of- 
core  solution  package  into  the  serial  code,  and  is  now  used  extensively  in  the  parallel  code.  The  matrix 
fill  procedure  was  modified  so  that  an  arbitrary  sub-block  of  the  MM  system  matrix  could  be 
generated.  Logic  was  included  to  skip  over  any  matrix  element  calculations  which  do  not  contribute  to 
the  selected  sub-block.  Since  each  triangular  facet  can  have  three  unknowns  associated  with  it,  the 
ordering  of  the  facets  and  edges  (unknowns)  can  adversely  affect  the  efficiency  of  the  block-fill 
procedure.  A  Reverse  Cuthill-McKee  algorithm  was  included  to  reorder  the  facets  and  edges.  Second, 
the  serial  code  was  moved  to  the  parallel  machine,  and  logic  was  included  to  perform  the  reading  and 
sending  of  the  geometry  data.  Finally,  specific  routines  were  written  which  interface  CARLOS-3D  to 
each  solver  package  which  is  used.  The  parallel  solver-specific  routines  generate  the  matrix  and 
excitation  vectors,  call  the  solver,  and  compute  the  far-fields.  Therefore,  the  code  can  be  easily 
interfaced  to  other  future  solver  packages  by  simply  writing  a  new  routine  specific  to  that  solver.  For 
this  work,  we  interfaced  CARLOS-3D  to  both  the  Intel  Paragon  ProSolver-DES  package  for  large  out- 
of-core  solutions,  and  to  an  in-core  solver  developed  at  Sandia  National  Laboratories. 
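
The effect of the reordering step can be illustrated with a minimal pure-Python sketch of reverse Cuthill-McKee. The connectivity graph below is a hypothetical example (two unknowns are neighbours when they interact locally), and the function names are illustrative rather than the routines used in CARLOS-3D.

```python
from collections import deque


def reverse_cuthill_mckee(adj):
    """Breadth-first sweep visiting neighbours by increasing degree, then reversed."""
    n = len(adj)
    visited = [False] * n
    order = []
    # Start each connected component from an unvisited node of minimum degree.
    for start in sorted(range(n), key=lambda v: len(adj[v])):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in sorted(adj[v], key=lambda u: len(adj[u])):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)
    return order[::-1]


def bandwidth(adj, order):
    """Largest distance, in the given ordering, between two connected unknowns."""
    position = {v: k for k, v in enumerate(order)}
    return max((abs(position[v] - position[w])
                for v in range(len(adj)) for w in adj[v]), default=0)


# Hypothetical connectivity: a chain of unknowns numbered in scrambled order.
chain = [0, 5, 2, 7, 1, 6, 3, 4]
adj = [[] for _ in chain]
for a, b in zip(chain, chain[1:]):
    adj[a].append(b)
    adj[b].append(a)

before = bandwidth(adj, list(range(len(adj))))
order = reverse_cuthill_mckee(adj)
after = bandwidth(adj, order)
```

With a smaller bandwidth, the unknowns attached to one facet tend to fall in the same index range, so fewer facet pairs straddle the boundary of a requested sub-block.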

For  the  ProSolver-DES  interface,  the  MM  system  matrix  is  partitioned  into  sub-blocks.  Care  is  taken 
to  balance  the  workload  between  the  nodes  during  the  generation  of  the  RHS  vectors  and  the  matrix 
sub-blocks.  Since  the  RHS  vector  generation  is  much  faster  than  the  matrix  generation,  only  selected 
nodes  are  used  to  perform  the  former  task  in  order  to  reduce  contention  for  the  I/O  channels  to  the 
out-of-core  file  system.  Additionally,  for  symmetric  targets,  disk  space  is  reused  for  storage  of  the 
system  matrix  and  RHS  files  during  the  solution  of  the  decoupled  systems  of  equations. 

For  the  Sandia  solver  interface,  the  matrix  is  partitioned  so  that  each  node  is  responsible  for  both  a 
part  of  the  system  matrix  and  matrix  of  RHS  vectors.  The  block-matrix  fill  option  is  used  to  generate 
the  part  of  the  matrix  for  which  the  node  is  responsible.  The  appropriate  RHS  vectors  are  also 
computed  for  each  node,  and  then  the  solver  is  called.  The  parts  of  the  permuted  solution  vectors 
which  reside  on  each  node  are  then  used  to  compute  the  far-fields. 

Results  and  Performance 

The  computation  of  the  RCS  for  the  VFY  218  aircraft  and  a  one  meter  almond  will  now  be  discussed. 
These  problems  were  run  on  the  Sandia  Paragon  installation  which  has  a  normal  configuration  of  1840 
compute  nodes  that  are  distributed  in  a  16x115  mesh.  CARLOS-3D  was  interfaced  to  the  Sandia  in- 
core  solver  and  the  results  are  collected  below. 
The first problem geometry is the VFY 218 aircraft. This model contains 38,922 triangular patches
and 19,840 nodes, and at 300 MHz the spatial resolution is 204 facets per square wavelength. With
one symmetry plane, two systems are solved, one with 29,381 and the second with 29,002 unknowns.
The run times for this problem with 1 and 181 RHS vectors are given in Table 1 and the monostatic
RCS is shown in Figure 1.


Table 1: VFY 218 Job Statistics

Operation                  Time (sec), 1 RHS    Time (sec), 181 RHS
I/O and pre-processing            34.9                  34.9
Matrix assembly                  846.0                 846.0
System solve                     965.0                1254.0
Post-processing                   20.1                  20.1
Total                           1866.0                2155.0

The solver performance for 1 RHS is 137 Gflops/s. This was obtained by optimizing the solver to the
machine topology, including the additional I/O nodes on the machine, and using the second processor
on each node for computation. Normally, it is used for communication. The effect on performance of
mapping the matrix to a square mesh is also given in Table 2.
Table 2: Solver Performance, 29381 unknowns

Paragon Mesh    RHS    Solve Time    Gflops/s    Mflops/s
(matrix map)           (sec)                     (node)
43 x 44           1        -           75.19       37.9
16 x 115        181      637.3        106.1        57.7
16 x 119          1      490.8        137.8        72.4

(*) only one processor used for computation


The  performance  in  Table  2  shows  the  effect  of  multiple  RHS  vectors  using  only  one  processor  for 
computation  and  using  a  matrix  mapping  that  does  not  match  the  machine  topology. 

The  next  problem  considered  is  the  1  meter  almond  at  6  GHz.  This  model  contains  two  planes  of 
symmetry  resulting  in  solving  four  systems  of  equations.  These  consist  of  9984,  9828,  9828,  and 
9672  unknowns.  The  solver  performance  is  shown  for  this  problem  in  Table  3  for  9984  unknowns 
and  181  RHS  vectors.  The  monostatic  RCS  is  shown  in  Figure  2.


Table 3: Almond with 9984 unknowns and 181 RHS

Paragon Mesh    Solve Time    Gflops/s    Mflops/s
(matrix map)    (sec)                     (node)
16 x 16           273.1          9.7        37.9
16 x 32           163.9         16.2        31.6
16 x 115           57.2         49.4        26.8
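
The per-node column of Table 3 can be checked against the aggregate rate and the mesh size. The short sketch below assumes that the per-node figure is simply the aggregate Gflops rate divided evenly over all nodes of the mesh; the function name is illustrative.

```python
# (mesh rows, mesh columns, aggregate Gflops) as listed in Table 3
table3 = [(16, 16, 9.7), (16, 32, 16.2), (16, 115, 49.4)]


def mflops_per_node(rows, cols, gflops):
    """Aggregate rate in Gflops spread evenly over a rows x cols mesh, in Mflops."""
    return 1000.0 * gflops / (rows * cols)


per_node = [round(mflops_per_node(*row), 1) for row in table3]
# per_node reproduces the last column of the table: 37.9, 31.6, 26.8
```

The drop from 37.9 to 26.8 Mflops per node shows how the fixed problem size of 9984 unknowns is spread ever more thinly as the mesh grows.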

Figure  2.  RCS  versus  angle  for  1  meter  almond  at  6  GHz. 
The overall timing for the almond is collected in Table 4. The total time also includes the pre- and
post-processing that is necessary for the geometry and RCS calculation.


Table 4: Overall timing for almond

Paragon Mesh    Matrix Fill    Solve Time    Total Time
(matrix map)    (sec)          (sec)         (sec)
16 x 16           1216.6         1004.         2642.4
16 x 32            734.4          633.6        1634.4
16 x 115           397.8          222.7         766.8

The process of parallelizing a complex applications-oriented method-of-moments code, CARLOS-3D,
has been described. This effort has resulted in a code which has the flexibility to run on
workstations, Intel iPSC/860 machines, and Paragon machines. The result is a code which is more
easily extended, and permits most new features to be validated independent of the parallel
implementation. The parallel implementation relies heavily upon features which were validated in the
serial version. In addition, the structure of the code allows for the easy incorporation of advanced
solvers and modeling techniques in the future.
Parallel computing for electromagnetism at ONERA

A.  de  La  Bourdonnaye,  A.  Cosnuau,  X.  Ferrieres,  P.  Leca  and  F.-X.  Roux 
ONERA,  Division  Calcul  Parallele,  Chatillon,  France 
Abstract. In this paper, on the one hand, we present the development of a coupled volumic-surfacic
finite element code for R.C.S. computations in the frequency domain.

We have realized a parallel solver for the surfacic finite element part in the case of axisymmetric bodies with
multi-level meshes which avoid singular elements at the poles.

Already, in the case of a one-level mesh, this solver reaches a performance of 3 Gflops on a 128-processor
iPSC-860. Furthermore, we have studied parallel algorithms for the coupled solver which rely on an adaptation
of a substructuring method which is used for elliptic problems.

On the other hand, we present the parallelization of a FDTD code based on a finite difference scheme, in
collaboration with the Department of Physics of ONERA. The parallelization is made through partitioning the
grid into subdomains. The interface coherency is managed through message passing between processors allocated
to neighboring subdomains. This approach has been demonstrated to be very efficient and scalable for distributed
memory machines. The performance of this code for real industrial applications is 700 Mflops on a 64-processor
Paragon.


1. Introduction. This paper mainly addresses two topics. The first one is the description
of a coupled volumic-surfacic method for R.C.S. computations and the state of its implementation
on a distributed memory Intel computer. The second topic is the parallelization of a FDTD
code in collaboration with the Department of Physics.

In the first topic we aim at solving a frequency domain scattering problem with a heterogeneous,
spatially bounded obstacle. We choose to wrap this body in a surface on which we will use
exact (and thus non-local) boundary conditions. It will allow us to use a volumic finite element
method inside the surface and a discretized integral equation on the surface. Such mathematical
formulations have already been studied in many places, for instance in [4], [8], [9] or [6]. We make
the choice of an axisymmetric wrapping surface as in [5], since it allows us to speed up part of the
computation and to save memory. Here we focus on two points: first, the solution algorithm of the
coupled problem, which relies on a subdomain point of view, and second, the actual implementation
of the surfacic solver for the axisymmetric case. In this paper, we will first describe the
physical situation we address, set some notations, and recall which mathematical formulation we
use. After that we will present and analyze a solution scheme for the coupled system. Finally we
will focus on the numerical solution of the integral equation part of the scheme.

For the second topic, we will briefly set out the background and present some computer
performances.

2.  Frequency  Domain  computations. 

2.1. Position of the problem. Let $\Omega$ be a bounded domain of $\mathbb{R}^3$ and $\Gamma$ its boundary,
which is supposed to be regular. Let $\varepsilon(x), \mu(x)$ be the relative electric permittivity and magnetic
permeability. They are supposed to be at least piecewise continuous. Moreover, we assume that
outside $\Omega$, $\varepsilon = \mu = 1$. We want to find $E, H$, the electromagnetic field satisfying the Maxwell
system:

(1)  $i\omega B = -\mathrm{rot}\,E + J$

(2)  $i\omega D = \mathrm{rot}\,H$

(3)  $\mathrm{div}\,D = \rho$

(4)  $\mathrm{div}\,B = 0$

and an outgoing wave condition, where

(5)  $D = \varepsilon(x) E$

(6)  $B = \mu(x) H$

and $J$ is a source of electric current and $\rho$ is the corresponding charge density.

2.2. Mathematical formulation of the coupled problem. Throughout this part, we will use
the work of M. Cessenat [2], V. Levillain [8] and our own [6] as a mathematical background (see
also Colton and Kress [3]). For the sake of clarity in the formulas, we first define some operators.

(7)  A  :  //.'lAr) ffiC(r) 

(8)  j j^V,G(\x-y\)/\idy 

(9)  and 

(10)  P2  :  - /f,-^(r) 

(11)  j-^n^  G{\x  -  yj)  A  jdy  +  ^  G{\x  -  y\)divvjdy^ 

The  Sobolev  spaces  need  not  be  defined  here  since  we  shall  not  discuss  regularity.
We  also  denote  F  =  Then  following  [8]  in  the  use  of  integral  representation  of  electric  and 
magnetic  field,  we  obtain  the  following  formulation  of  the  Maxwell  system  where  <  ,  >  denotes 
the  hermitian  product  on  V  and  (  ,  )  denote  the  hermitian  product  in 

(12)  (-curlF,  curlF')  -  k\(F,  F')  -(J„  F’}  =  <  (^  -  A)(j)  -  k'^P2(n  A  F),  F’  > 

(13)  0  =  <  (^  +  Pi)(n  A  F)  +  P2(j),  n  a/ > 


where  j  ^  H  /\n,  and  f  and  F'  are  test  functions. 

2.3. General solution algorithm. Here we use an algorithm which comes from the substructuring
method for elliptic problems. In our case, we suppose that we have two domains: one
is $\Omega$ and the other its exterior. Of course, the equations modeling the exterior are the integral
equations in (12-13). After having discretized these two equations with a finite element technique,
we obtain the following linear system:

([A  B  0  1  f  1  f  1  ^ 

(14)  F;.,  nA/].  F*  C-k‘^P2  1/2- Pi  n  A  Fr  -  0  =0 

VL  0  1/2  + Pi  P2  J  L  ;  J  L  0  J/ 

where the matrices $A$, $B$, $C$ represent the volumic part of equation (12) and the matrices $P_1$ and
$P_2$ represent the integral operators denoted by the same symbols. Then the method consists in
eliminating the unknowns $F_\Omega$ and $j$, leading to the following system:

(15)  {Si  +  Se.)n  A  F  ^  B* A'^BJs 




where $S_i = C - B^* A^{-1} B$ and $S_e = -k^2 P_2 - (1/2 - P_1) P_2^{-1} (1/2 + P_1) = -P_2^{-1} (1/2 + P_1)$ are
usually called the Schur complements associated respectively with the inside and the outside of
$\Omega$. In order not to fill the resulting matrix, one commonly uses an iterative method (generally a
conjugate gradient like method) for the linear system (15). We have only to factorize $A$ and $P_2$.
Here, with just two subdomains, the gain is the following. If the global system is well-conditioned,
then the “Schur” method can efficiently use the best solvers for the sparse system inside $\Omega$ and
the dense system on $\Gamma$.
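
As a worked version of the elimination described in this section, the three block rows can be written out explicitly. The names $x_\Omega$ (interior unknowns), $x_\Gamma$ (the unknown $n \wedge F$ on the surface) and the source vector $f$ are notation assumed here for illustration; in particular, the sign and exact form of the right-hand side depend on how the source term is written.

```latex
\begin{aligned}
A\,x_\Omega + B\,x_\Gamma &= f, \\
B^{*} x_\Omega + (C - k^2 P_2)\,x_\Gamma + (\tfrac12 - P_1)\,j &= 0, \\
(\tfrac12 + P_1)\,x_\Gamma + P_2\,j &= 0 .
\end{aligned}
\qquad\Longrightarrow\qquad
\begin{aligned}
x_\Omega &= A^{-1}\,(f - B\,x_\Gamma), \\
j &= -P_2^{-1}(\tfrac12 + P_1)\,x_\Gamma, \\
(S_i + S_e)\,x_\Gamma &= -B^{*}A^{-1} f,
\end{aligned}
```

with $S_i = C - B^{*}A^{-1}B$ and $S_e = -k^2 P_2 - (\tfrac12 - P_1)\,P_2^{-1}(\tfrac12 + P_1)$. Each application of $S_i + S_e$ inside an iterative method then needs one solve with the sparse factor of $A$ and one with the dense factor of $P_2$, and the interface matrix itself is never formed.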

2.4. A fast solver for an axisymmetric integral equation. In this part, we present the
direct solver we are developing for axisymmetric integral equations. The interest of embedding
the heterogeneous body in an axisymmetric surface is the following. Because of the symmetry,
one can mesh $\Gamma$ in a meridian-parallel way (see fig. 1). Then, numbering the degrees of freedom
by following the meridian lines, one obtains a block-circulant matrix, each block corresponding
to the interaction between two meridians. First we see that we can save memory, and time for
the computation of the matrix. Furthermore, this kind of matrix can be block-diagonalized using
Fourier transforms. One can then factorize the blocks. All this machinery can be efficiently
parallelized. Nevertheless, a meridian-parallel mesh has one main drawback: the triangles at
the poles are degenerate. In order to treat this point, we developed a meshing technique
which keeps the main advantages of the axisymmetric surfaces without the degenerate elements.
Furthermore, this technique leads to algorithms which are still parallelizable.

Let us first develop the simplest algorithm. In order to obtain a full optimization, we impose
the number of meridian slices to be a power of 2. With the numbering previously presented, the
matrix $M$ is written

$$M = \begin{pmatrix} A_0 & A_1 & \cdots & A_{n-1} \\ A_{n-1} & A_0 & \cdots & A_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ A_1 & A_2 & \cdots & A_0 \end{pmatrix}$$

In order to maintain consistency between the element size in meridian and parallel directions,
one must keep $n_p = O(k)$ and $n = O(k)$, where $k$ is the wave number, $n_p$ is the number of
parallels and $n$ is the number of meridians. The total number of degrees of freedom is then

$$N = n_p \times n = O(k^2).$$

It is a well-known fact that this kind of matrix can be block-diagonalized with an FFT type
algorithm (cf. [5]). Let $U$ be a vector of degrees of freedom. We decompose $U$ in $(U_0, U_1, \dots, U_{n-1})$,
$U_i$ being the components of $U$ on the $(i+1)$-th slice. We want to solve $V = MU$ and we also
decompose $V$ as $V = (V_0, V_1, \dots, V_{n-1})$.

We state a few more definitions. We call $\omega$ the primitive $n$-th root of unity and $\bar\omega$ its complex
conjugate. Then we set $\mathcal{F}(W)_k = \sum_j \omega^{jk} W_j$ and $\bar{\mathcal{F}}(W)_k = \sum_j \bar\omega^{jk} W_j$, the discrete and
discrete inverse Fourier transforms of $(W_0, \dots, W_{n-1})$. We have $\bar{\mathcal{F}}(V)_k = \mathcal{F}(A)_k\,\bar{\mathcal{F}}(U)_k$. Hence,
for each $k$, $U_k = \frac{1}{n}\,\mathcal{F}\big(\mathcal{F}(A)_l^{-1}\,\bar{\mathcal{F}}(V)_l\big)_k$. Then the direct solver algorithm is as follows.

•  Compute the inverse Fourier transform $\bar{\mathcal{F}}$ of $V$.

•  Compute the Fourier transform $\mathcal{F}$ of $A$.

•  Compute the LU factorisations of $\mathcal{F}(A)_k$ for each $k$.

•  Perform inversions $U'_k = \mathcal{F}(A)_k^{-1}\,\bar{\mathcal{F}}(V)_k$.

•  Compute the Fourier transform $\mathcal{F}$ of $U'$ divided by $n$.

We evaluate the complexity. For the factorisation, we have $n$ LU factorisations of size $n_p$, which
makes $O(n\,n_p^3) = O(N^2)$ operations. For the solution, we have 2 FFT's and $n$ inversions of size $n_p$,
which makes $O(N^{3/2})$ operations.
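
The steps of the direct solver can be sketched on a small example. The sketch assumes the convention $M_{ij} = A_{(j-i) \bmod n}$ and $\omega = e^{2i\pi/n}$, which may differ from the authors' convention by a conjugation; plain $O(n^2)$ transform sums stand in for FFTs and a small Gaussian elimination stands in for the LU factorisations, and all names and values are illustrative.

```python
import cmath


def solve(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x


def dft(blocks, sign):
    """k-th output is sum_j w**(sign*j*k) * blocks[j], with w = exp(2*pi*i/n)."""
    n, size = len(blocks), len(blocks[0])
    w = cmath.exp(2j * cmath.pi / n)
    return [[sum(w ** (sign * j * k) * blocks[j][e] for j in range(n))
             for e in range(size)] for k in range(n)]


def block_circulant_solve(A, V):
    """Solve M U = V where M[i][j] = A[(j - i) % n], one small solve per frequency."""
    n, m = len(A), len(V[0])
    A_hat = dft([sum(blk, []) for blk in A], +1)   # transform of the flattened blocks
    V_hat = dft(V, -1)                             # inverse-type transform of V
    U_hat = []
    for k in range(n):
        A_k = [A_hat[k][r * m:(r + 1) * m] for r in range(m)]
        U_hat.append(solve(A_k, V_hat[k]))
    return [[u / n for u in blk] for blk in dft(U_hat, +1)]


def block_circulant_apply(A, U):
    """Reference product V_i = sum_j A[(j - i) % n] U_j."""
    n, m = len(A), len(U[0])
    return [[sum(A[(j - i) % n][r][c] * U[j][c] for j in range(n) for c in range(m))
             for r in range(m)] for i in range(n)]


# Hypothetical data: n = 4 meridian slices, blocks of size m = 2.
A = [[[6, 1], [1, 5]], [[1, 0.5j], [0, 1]], [[0.5, 0], [0.2, 0.3]], [[0.1j, 0.4], [0.3, 0]]]
U_true = [[1, 2j], [0.5, -1], [-2, 1 + 1j], [3, 0]]

V = block_circulant_apply(A, U_true)
U = block_circulant_solve(A, V)
err = max(abs(u - t) for bu, bt in zip(U, U_true) for u, t in zip(bu, bt))
```

Only the $n$ transformed blocks are ever factorized, which is the source of the memory and time savings compared with treating $M$ as a general dense matrix.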

Now we present the parallelization of the direct solver algorithm. Let us first recall that the
iPSC860 is a distributed MIMD supercomputer with a hypercube network. In [5] we presented
a parallel implementation of FFT's. In this paper we give another one which is more efficient for
our purpose (details may be found in [7]).

Let us say for simplicity that we want to perform a block Fourier transform on $n$ processors.
Here, a block is either a vector subpart $V_i$ or a block submatrix $A_i$. Our algorithm needs the size
$m$ of a block to be a multiple of the number of processors: $m = p \cdot n$. So we have to perform $m$
FFT's. We want to transfer data in order to have $p$ FFT's to do on each processor. The situation
at the beginning is that we have one block on each processor. Each block $V_i$ will be divided into
$n$ sub-blocks $V_i^j$ of size $p$. Now, processor $i$ has to send $V_i^j$ to processor $j$ for all $j$. At the end of
this communication phase, we have on processor $i$ the sub-blocks $V_j^i$ for all $j$. Each processor can
now perform local FFT's on the sub-blocks. Then we perform the inverse transposition algorithm
to obtain the blocks $\mathcal{F}(V)_i$, each on one processor.

We are now going to explain how this algorithm is implemented on a hypercube network in
order to reach maximum efficiency. First, we recall how the path from processor $i$ to processor $j$
is determined in the hypercube network. We write $i$ and $j$ in base 2: $i = a_d a_{d-1} \dots a_1$ and
$j = b_d b_{d-1} \dots b_1$. Let $c = a \oplus b$ be the “bitwise exclusive or” of $a$ and $b$. Then we construct a
sequence $(a^s)$ from $a$ to $b$ where $a^s$ differs from $a^{s-1}$ in only one bit. To construct this sequence we
simply change one bit of the current term each time we encounter a 1 bit in the decomposition of $c$
when we go from $c_1$ to $c_d$.

To perform the transposition algorithm, for each processor, we order the rest of the nodes of the
hypercube in the following way. First, we put in the same class all the nodes which have the same
distance from the reference one. Then, in each class we order the nodes with a lexicographical
order (in fact a reverse one). Let us give an example for a hypercube with 8 processors: for node
0, “distance 1” nodes are nodes 1, 2 and 4, “distance 2” nodes are nodes 3, 5 and 6, and the “distance
3” node is node 7. Then for processor 0 nodes are ordered in the following way: 1 2 4 3 5 6 7.
For processor 1 it is an exercise to verify that nodes are ordered this way: 0 3 5 2 4 7 6.
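
The ordering in this example can be reproduced with a short sketch. It assumes that the partners of a node are obtained by XOR-ing its number with offsets sorted first by number of 1 bits and then by value, and that messages are routed by flipping the differing bits from lowest to highest; both are assumptions chosen to match the two orderings quoted above, and the function names are illustrative.

```python
def partner_order(node, dim):
    """Partners of `node`, ordered by hypercube distance, then by XOR offset."""
    offsets = sorted(range(1, 2 ** dim), key=lambda c: (bin(c).count("1"), c))
    return [node ^ c for c in offsets]


def route(src, dst, dim):
    """Directed links (node, bit) crossed when differing bits are flipped low to high."""
    links, cur = [], src
    for bit in range(dim):
        if ((src ^ dst) >> bit) & 1:
            links.append((cur, bit))
            cur ^= 1 << bit
    return links


def contention_free(dim):
    """Check that no directed link is used twice within any communication step."""
    size = 2 ** dim
    schedules = [partner_order(i, dim) for i in range(size)]
    for step in range(size - 1):
        used = [link for i in range(size)
                for link in route(i, schedules[i][step], dim)]
        if len(used) != len(set(used)):
            return False
    return True


order_0 = partner_order(0, 3)   # [1, 2, 4, 3, 5, 6, 7]
order_1 = partner_order(1, 3)   # [0, 3, 5, 2, 4, 7, 6]
ok = contention_free(3)
```

At a given step every node uses the same XOR offset, so the messages travel along parallel paths; this is why each directed link carries at most one message per step.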

Now for each processor, we begin to communicate with the first, then the second, then the next
..., according to the previously defined order. One can verify that a link is used at most one time
in each direction at each step. Hence, this communication scheme does not create any contention
problem on the network. In the following tables we give some performances we have reached
on the iPSC860. We measured CPU time for diagonalization, factorization and solution of linear
systems. We vary the cube dimension, the block size and the number of right hand sides.

Table 1
Block Circulant System on 64 nodes
Performances in Mflops (64 bits) with 1 rhs

Block dimension     128     256     384     512
Mflops              585     997    1275    1433

Table 2
Block Circulant System on 64 nodes
Performances in Mflops (64 bits) with 128 rhs

Block dimension     128     256     384     512
Mflops              825    1173    1395    1515

Table 3
Block Circulant System on 128 nodes
Performances in Mflops (64 bits) with 1 rhs

Block dimension     128     256     384     512
Mflops              970    1917    2479    2846

Table 4
Block Circulant System on 128 nodes
Performances in Mflops (64 bits) with 128 rhs

Block dimension     128     256     384     512
Mflops             1530    2308    2743    3012

The next point is to show how to avoid sharp triangles at the poles. This method is developed
in great detail in [5]. We start at the poles with a finite number of triangles. Then, as long as the width
of the parallels is not greater than a criterion based on the wavelength, we carry on meshing each
parallel as before. Once the criterion is reached, we divide each parallel band into two parallel
sub-bands and we carry on dividing or regrouping according to the length of the parallels. Without
going into very intricate details, we can just say that still using FFT we can obtain a sparse
matrix, with the same complexity as we had for the previous algorithm. The matrix has a filling
pattern looking like the one in figure (2). In that case we began with four elements at the poles,
and we divided the parallels twice. We can see on this picture that the filling pattern is sky-line.
Hence, a factorization will not modify the memory requirements. Furthermore, distributing the
diagonal blocks among the processors, we still can design a parallel factorisation. In the next
table, we give some results of the implementation of the solver on the iPSC860.

Fig. 2. Pattern of the matrix with two divisions.


Table 5
Performance of the multilevel solver
Performance in Mflops (64 bits)

Number of processors               8      16      32      64
Number of degrees of freedom    1960    3870    5215   16575
Mflops                           170     316     546       -
Mflops with pivoting             140     271     454    1155
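
The remark that a skyline filling pattern leaves the memory requirements unchanged can be checked on a small example. The sketch below is an illustration, not the solver of the paper: the profile `first` and the matrix entries are hypothetical, the profile is assumed structurally symmetric, and the factorization is a plain LU without pivoting. Row interchanges could move entries outside the profile, so the check does not cover the pivoted variant.

```python
def lu_in_place(a: list[list[float]]) -> None:
    """Doolittle LU without pivoting; L (unit diagonal) and U overwrite `a`."""
    n = len(a)
    for k in range(n):
        for i in range(k + 1, n):
            if a[i][k] != 0.0:
                a[i][k] /= a[k][k]
                for j in range(k + 1, n):
                    a[i][j] -= a[i][k] * a[k][j]


def inside_skyline(a: list[list[float]], first: list[int]) -> bool:
    """True if every nonzero a[i][j] satisfies first[max(i, j)] <= min(i, j)."""
    n = len(a)
    return all(a[i][j] == 0.0 or first[max(i, j)] <= min(i, j)
               for i in range(n) for j in range(n))


# Hypothetical profile: first[i] is the first stored column of row i
# (and, by structural symmetry, the first stored row of column i).
first = [0, 0, 1, 1, 3, 2]
n = len(first)
a = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(first[i], i):
        a[i][j] = 1.0
        a[j][i] = -1.0
    a[i][i] = 10.0          # dominant diagonal, so no pivoting is needed

lu_in_place(a)
print(inside_skyline(a, first))   # True: the factors fit in the original profile
```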

3. Parallelization of the code ALICE. ALICE is a 3D Maxwell equation solver developed
by the Department of Physics of ONERA. This code is mainly used for structures struck by
lightning, but its application domain could be extended. It uses an FDTD approach, based on an
explicit finite-difference scheme of leap-frog type in space and time. The domain of computation
contains both object structures and wires [10].

The parallelization of this code has been carried out by splitting the structured mesh into pencils.
Each pencil is allocated to a processor. This code is scalable on any grid of processors. It is
written in Fortran and runs with 32-bit arithmetic.
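
A pencil decomposition of a structured mesh can be sketched as follows. This is a minimal illustration, not the ALICE code: it assumes that the x-y plane is split over a px-by-py processor grid and that each pencil keeps the full z extent (the text does not say which axis the pencils follow). The function names and the 8 x 8 grid are hypothetical.

```python
def split(n: int, parts: int) -> list[range]:
    """Split range(n) into `parts` contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(n, parts)
    chunks, start = [], 0
    for r in range(parts):
        size = base + (1 if r < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def pencils(nx: int, ny: int, nz: int, px: int, py: int) -> dict:
    """Map each processor (a, b) of a px-by-py grid to its pencil of cells."""
    xs, ys = split(nx, px), split(ny, py)
    return {(a, b): (xs[a], ys[b], range(nz))
            for a in range(px) for b in range(py)}


# Satellite case from the text (96 x 152 x 168) on an assumed 8 x 8 processor grid.
layout = pencils(96, 152, 168, 8, 8)
xr, yr, zr = layout[(0, 0)]
print(len(layout), len(xr), len(yr), len(zr))   # 64 12 19 168
```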

Some performance figures are given for the following industrial cases:

•  A metallic box (40 x 40 x 40) embedded in an 80 x 80 x 80 volume.

•  Launcher1 on its launch pad: 205 x 136 x 112.

•  Launcher2 on its launch pad: 98 x 104 x 112.

•  A satellite: 96 x 152 x 168.

1 We are indebted for those results to F. Choukhroun.

Table 6
Performance of ALICE on PARAGON and CRAY-YMP for 1000 time steps

Machine       PARAGON 64 PE    CRAY-YMP 1 PE    T_Cray / T_Paragon
80^3 box      52 s             184 s            3.5
Launcher1     267 s            771.87 s         2.9
Launcher2     84.4 s           286.175 s        3.4
Satellite     173.33 s         950 s            5.48

The Paragon execution times are very attractive compared to those of the one-processor CRAY-YMP.
Furthermore, the larger memory of the PARAGON machine makes some very big industrial applications
feasible: for instance, the Launcher2 case runs in 12 hours with a (200 x 400 x 400) domain
(1.2 Gbyte) for 25000 time steps. Such a problem could not be treated on the CRAY-YMP
machine.

4. Conclusion. The parallel computing division of ONERA has two kinds of activity in the
field of computational electromagnetism.

The first kind of activity is research into new numerical methods which perform well from both
the numerical and the parallel computing point of view. This led us first to the coupling between
integral equations and volume equations, and secondly to the concept of multilevel meshes and solvers.
Then, we try to adapt the method to obtain an algorithm which is as parallel as possible without
spoiling the performance in terms of speed of computation. In our case, we reached
3 GFlops on 128 processors of an iPSC860 for the one-level solver, and 1.1 GFlops on 64
processors for the multilevel one.

The second kind of activity consists of the development of parallel versions of existing codes, in
collaboration with other departments of ONERA or with industrial partners. We have given
an example of such a development for the code ALICE, which is nowadays routinely used
on the 66-processor PARAGON machine at ONERA. This development allows engineers of the
Department of Physics to use finer numerical models for predicting the electromagnetic behaviour
of structures struck by lightning.

I. Introduction

The edge-based finite element method [1] is an important numerical method for the
electromagnetic modeling of complex geometries. The computation domain is discretized into
tetrahedra, where the unknowns are associated with the edges of the tetrahedra. This gives rise
to a system of equations with potentially millions of unknowns for large three-dimensional (3D)
geometries. The solution of the resulting matrix equation is computationally intensive and requires
parallelization in order to be solved practically.

In this paper, we address the parallelization of an iterative matrix solver, the Quasi-Minimal
Residual (QMR) algorithm [2, 3], on distributed memory multiprocessors such as the Intel Delta [4],
where the processors are interconnected in a 2D mesh topology with wormhole routing. The QMR
is a recently introduced method for the solution of general non-Hermitian linear systems $Ax = b$.
During the parallel solution of this system of equations, the rows of $A$ and the corresponding entries
of the vectors are distributed to the processors of the mesh. The processors need to communicate
to exchange the required data available on other processors.

The most time-consuming operation in the QMR algorithm is the matrix-vector multiplication,
which requires extensive interprocessor communication. If the partitioning algorithm yields a
banded matrix $A$, then only nearest neighbor communication is required. However, our matrices
may not be banded, and each processor may need to communicate with several other processors
that are not necessarily adjacent to it, depending on the domain to be decomposed and the
decomposition method. We have implemented this communication pattern with an algorithm
which we call the Most Messages First (MMF) algorithm, and we use it in the parallel solution
of the QMR method on the Intel Touchstone Delta.

In this paper, Section II lists the QMR algorithm. Section III describes the computation and
communication requirements of this algorithm and provides the details of the MMF algorithm.
Finally, Section IV shows the results and speedups obtained for the problem of electromagnetic
scattering from a conducting sphere.


II. The Quasi-Minimal Residual (QMR) method


The QMR algorithm is an iterative method for solving linear systems of equations $Ax = b$
[2, 3]. Although the QMR method can solve general non-Hermitian linear systems, we will only
use the simplest form of the QMR algorithm in this paper to illustrate its parallelization and
communication requirements. We will use upper case letters to refer to matrices, lower
case letters ($p$, $q$, $v$, $w$, $d$, $x$) to refer to vectors, and Greek letters (such as $\rho$, $\delta$, $\eta$,
$\omega$) to refer to scalars. The subscript $n$ refers to the iteration number of the QMR algorithm.

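Algorithm 1 (The QMR algorithm):

0) Choose $x_0 \in \mathbb{C}^N$ and set $r_0 = b - A x_0$.
   Compute $\rho_1 = \|r_0\|$ and set $v_1 = r_0 / \rho_1$.
   Choose $w_1 \in \mathbb{C}^N$ with $\|w_1\| = 1$.
   Set $p_0 = q_0 = d_0 = 0$, $c_0 = \varepsilon_0 = \xi_1 = 1$, $\vartheta_0 = 0$, $\eta_0 = -1$.

For $n = 1, 2, \ldots$, do:

1) If $\varepsilon_{n-1} = 0$, then stop.
   Compute $\delta_n = w_n^T v_n$. If $\delta_n = 0$, then stop.

2) Compute
   $$p_n = v_n - p_{n-1} (\xi_n \delta_n / \varepsilon_{n-1}),$$
   $$q_n = w_n - q_{n-1} (\rho_n \delta_n / \varepsilon_{n-1}).$$

3) Compute $\varepsilon_n = q_n^T A p_n$, $\beta_n = \varepsilon_n / \delta_n$, and set
   $$\tilde{v}_{n+1} = A p_n - v_n \beta_n, \qquad \rho_{n+1} = \|\tilde{v}_{n+1}\|,$$
   $$\tilde{w}_{n+1} = A^T q_n - w_n \beta_n, \qquad \xi_{n+1} = \|\tilde{w}_{n+1}\|.$$

4) Compute
   $$\vartheta_n = \frac{\omega_{n+1} \rho_{n+1}}{\omega_n c_{n-1} |\beta_n|}, \qquad c_n = \frac{1}{\sqrt{1 + \vartheta_n^2}}, \qquad \eta_n = -\eta_{n-1} \frac{\rho_n c_n^2}{\beta_n c_{n-1}^2},$$
   $$d_n = p_n \eta_n + d_{n-1} (\vartheta_{n-1} c_n)^2, \qquad x_n = x_{n-1} + d_n.$$

5) If $\rho_{n+1} = 0$ or $\xi_{n+1} = 0$, then stop.
   Otherwise, set
   $$v_{n+1} = \tilde{v}_{n+1} / \rho_{n+1}, \qquad w_{n+1} = \tilde{w}_{n+1} / \xi_{n+1}.$$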
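
A serial version of the recurrences may help when reading the parallelization that follows. The sketch below is a minimal illustration and not the authors' code. It assumes a real matrix, takes all QMR scaling weights equal to one, starts from $x_0 = 0$, picks $w_1 = v_1$ (any unit vector is allowed), and adds a residual test that the listing does not contain. The helper names are hypothetical.

```python
import math


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    return math.sqrt(dot(x, x))


def qmr(a, b, tol=1e-10, max_iter=100):
    """Coupled two-term QMR for a real system, unit weights, no look-ahead."""
    n = len(b)
    at = [list(col) for col in zip(*a)]            # A^T
    x = [0.0] * n                                  # x_0 = 0, so r_0 = b
    rho = norm(b)                                  # rho_1
    if rho == 0.0:
        return x
    v = [bi / rho for bi in b]                     # v_1
    w = v[:]                                       # w_1, assumed choice
    p, q, d = [0.0] * n, [0.0] * n, [0.0] * n      # p_0, q_0, d_0
    c_old, eps, xi, theta_old, eta = 1.0, 1.0, 1.0, 0.0, -1.0
    for _ in range(max_iter):
        delta = dot(w, v)                          # step 1
        if eps == 0.0 or delta == 0.0:
            break
        p = [vk - pk * (xi * delta / eps) for vk, pk in zip(v, p)]    # step 2
        q = [wk - qk * (rho * delta / eps) for wk, qk in zip(w, q)]
        ap = matvec(a, p)                          # step 3 (MV type)
        eps = dot(q, ap)
        beta = eps / delta
        if beta == 0.0:
            break
        v_t = [apk - vk * beta for apk, vk in zip(ap, v)]
        w_t = [atqk - wk * beta for atqk, wk in zip(matvec(at, q), w)]
        rho_new, xi_new = norm(v_t), norm(w_t)
        theta = rho_new / (c_old * abs(beta))      # step 4
        c = 1.0 / math.sqrt(1.0 + theta * theta)
        eta = -eta * rho * c * c / (beta * c_old * c_old)
        d = [pk * eta + dk * (theta_old * c) ** 2 for pk, dk in zip(p, d)]
        x = [xk + dk for xk, dk in zip(x, d)]
        residual = norm([bk - yk for bk, yk in zip(b, matvec(a, x))])
        if residual < tol or rho_new == 0.0 or xi_new == 0.0:         # step 5
            break
        v = [vk / rho_new for vk in v_t]
        w = [wk / xi_new for wk in w_t]
        rho, xi, c_old, theta_old = rho_new, xi_new, c, theta
    return x


print(qmr([[1.0, 2.0], [3.0, 4.0]], [1.0, 0.0]))   # close to [-2.0, 1.5]
```

Without look-ahead the iteration can break down for an unlucky $w_1$, for example when $w_1$ is an eigenvector of $A^T$. In that case the sketch simply returns the current iterate.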
III. Computational Requirements of the QMR algorithm


The computational requirements of the QMR algorithm can be classified into three types.

1. The first type involves the product of a matrix and a vector, such as the product of $A$
and vector $p_n$ and the product of $A^T$ and vector $q_n$. We will refer to this type as the
Matrix-Vector (MV) type.

2. The second type is characterized by the product of two vectors, such as the calculation of dot
products ($w_n^T v_n$) and vector norms ($\|\tilde{v}_{n+1}\|$). We will refer to this type as the Dot-Product
(DPr) type.

3. The third type involves the addition of two vectors, each scaled by a scalar constant, such
as $d_n = p_n \eta_n + d_{n-1} (\vartheta_{n-1} c_n)^2$. This type will be referred to as the Scalar-Alpha-X-Plus-Beta-Y
(SAXPBY) [2, 3] type.

The matrix $A$ and the vectors are distributed to the processors of the mesh using a partitioning
scheme that load balances the data in the processors and minimizes processor interactions. The
processors should be load balanced so that they finish computation at approximately the same
time. Communication time should be minimized since it increases the total execution time of
the parallel QMR. We used the Adaptive Jumps (AJ) method [5] to decompose the matrix and
vectors among the processors and to decrease the total volume of communication required for the
parallel execution of QMR.

A. Parallelizing the MV type
1. The product $A p_n$

During the parallel calculation of $A p_n$, the processors need to communicate to exchange the
required entries of the vector $p_n$ that are mapped to other processors. Figure 1 shows an example
of the distribution of the matrix $A$ and vector $p_n$ to 4 processors $P_0$, $P_1$, $P_2$, and $P_3$. Each processor
may need to communicate with several other processors that are not necessarily adjacent to it. The
resulting communication pattern is very irregular. It depends on the sparsity of $A$ and on the way
it is partitioned. This pattern of communication is referred to as the “All-To-Many Personalized
Communication” (ATMPC) [6]. The “all-to-many” pattern implies that each processor needs to
communicate with only a few other processors.

One way to implement the ATMPC is to use All-To-All Personalized Communication (ATAPC)
algorithms [7, 8, 9, 10, 11]. The ATAPC pattern is a regular pattern in which each processor needs
to send a personalized message to every other processor. The ATMPC can be implemented using
the ATAPC pattern by sending zero-length messages between two processors that do not actually
need to communicate. However, this scheme is not optimal, particularly if each processor needs to
communicate with only a few other processors: it will leave most of the processors and links of the
multiprocessor idle. On the other hand, if each processor sends its messages in random order, deadlock
can occur. During deadlock, the processors wait on each other, and none can actually receive
the messages that are destined for it.


Figure 1: The distribution of matrix $A$ and vector $p_n$ to 4 processors $P_0$, $P_1$, $P_2$, and $P_3$.
(Legend: * entries belong to $P_0$, o to $P_1$, x to $P_2$, + to $P_3$; rows and columns are numbered 1 to 16.)

The Most Messages First algorithm

Our approach to implementing the ATMPC pattern of communication on the 2D mesh is a greedy
one: we schedule the processor that has the most messages first. We group the messages
into several groups. Messages within the same group are sent simultaneously and form one step
of the ATMPC algorithm. We attempt to schedule as many messages as possible during each
step of the ATMPC in order to decrease the maximum number of steps (MNS) of this irregular
communication. We call our scheduling algorithm the “Most Messages First” (MMF) algorithm.
This algorithm specifies which processors should communicate at each step, as well as the total
number of steps of the ATMPC algorithm. It is run only once for a given distribution. The
MMF algorithm is based on the assumption that each processor can send and receive at most one
message at a time. Therefore, multiple messages from (to) a processor should be sent (received)
sequentially. The minimum number of steps of the ATMPC algorithm is therefore the maximum
number of messages that any processor has to send.

The scheduling procedure is based on a communication array $C$ ($p \times p$) that is available for
the given distribution of $A$ to the processors ($p$ = number of processors). $C$ is set such that entry
$(i, j)$ gives the length of the message that processor $i$ needs to send to processor $j$. If $C$ is not
sparse, then an ATAPC algorithm should be used. When $C$ is sparse, the null entries are removed
from it, and its rows are compressed to the left. The compressed $C$ matrix is distributed among
the processors. Each processor stores only the compressed row that indicates the messages it
should send. The compressed row of $C$ on each processor $P_i$ is stored as two linear arrays $ML(j)$
(Message Length) and $DP(j)$ (Destination Processor). $ML(j)$ is the length of the message that
processor $P_i$ has to send to processor $DP(j)$. We use the notation $(ML(j), DP(j))$ to represent
this message. Furthermore, $NS_i$ is the number of steps processor $P_i$ requires to send its messages,
and $RM_i$ is the number of remaining messages that processor $P_i$ has to schedule. $NS_i$ and $RM_i$
are initialized to the number of nonzero messages $P_i$ has to send. The MMF algorithm rearranges
the entries in the $DP$ and $ML$ arrays such that, at the end of this algorithm, entries $DP(j)$ and
$ML(j)$ on all the processors represent a group of messages to be sent simultaneously at step $j$ of
the ATMPC. Furthermore, MNS, the maximum number of steps of the ATMPC algorithm, is
set equal to the largest $NS_i$ over all processors ($i = 0, \ldots, p - 1$).

Let us choose processor $P_0$ to coordinate the scheduling process. At each step $j$, processor
$P_0$ sorts the processors by the number of unscheduled messages they need to send. Processor $P_0$
then goes over this list in descending order and allows each processor to schedule a message
at the current step $j$. This technique gives the processor with the largest number of messages a
better chance to schedule a message; therefore, it keeps MNS as close to the minimum as possible.

When a processor $P_i$ is ready to schedule a message for step $j$ of the ATMPC, it checks
the $j$-th entry of its arrays $DP$ and $ML$. If $DP(j)$ is not receiving any messages at step $j$, the
message $(ML(j), DP(j))$ is scheduled at this step, and processor $DP(j)$ is masked. The number of
remaining messages $RM_i$ in $P_i$ is decremented. However, if $DP(j)$ is masked, the MMF algorithm
goes to the end of the $DP$ array, that is, it considers entry $DP(NS_i)$. If $DP(NS_i)$ is also
masked, the algorithm considers $DP(NS_i - 1)$, $DP(NS_i - 2)$, $\ldots$, until it either finds a processor
$DP(NS_i - k)$ that is not masked or it gets to $DP(j)$ again. In the former case, the entries
$DP(NS_i - k)$ and $DP(j)$ are exchanged. Similarly, the entries $ML(NS_i - k)$ and $ML(j)$ are
exchanged. Therefore, the message $(ML(NS_i - k), DP(NS_i - k))$ is scheduled at the $j$-th step,
and $RM_i$ is decremented. The message $(ML(j), DP(j))$ is moved temporarily to the $(NS_i - k)$-th
step. In the latter case, if $DP(j)$ is reached again, then there are no messages in processor $P_i$
that can be sent at step $j$. The message $(ML(j), DP(j))$ is moved to the end of the $DP$ and
$ML$ arrays, hence it is stored at location $NS_i + 1$. As a result, $NS_i$ is incremented, but $RM_i$ is
not changed. Entries $DP(j)$ and $ML(j)$ are replaced by $-1$ to indicate that processor $P_i$ is not
sending any message at step $j$ of the ATMPC. Moreover, processor $P_i$ informs processor $P_0$ of the
remaining number of messages it has to send.

Processor $P_0$ updates the mask array and sends it to the next processor in the sorted list.
This is repeated until all the processors have had a chance to schedule a message for step $j$ of the
ATMPC. Once step $j$ is scheduled, $j$ is incremented, processor $P_0$ sorts the processors by the
number of remaining messages they need to send, and the above algorithm is repeated to
schedule the messages of a new step of the ATMPC. When all the messages on every processor are
scheduled, the maximum number of steps of the ATMPC (MNS) is set equal to the largest
$NS_i$, $i = 0, \ldots, p - 1$. After the steps of the ATMPC have been scheduled by the MMF algorithm, they are
used repeatedly each time $A p_n$ is computed or each time an ATMPC pattern is invoked.
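
The greedy idea can be sketched in a few lines. The code below is a simplified, sequential illustration and not the procedure described above. It keeps the "most messages first" ordering and the receiver mask, but it scans each destination list from the front instead of swapping entries of the $DP$ and $ML$ arrays, and it ignores message lengths. The function name and the example pattern are hypothetical.

```python
def mmf_schedule(dests: dict[int, list[int]]) -> list[dict[int, int]]:
    """Greedy 'most messages first' schedule.

    dests[i] lists the processors that processor i must send to.
    Returns one dict per step mapping sender -> receiver, such that every
    processor sends at most one and receives at most one message per step.
    """
    remaining = {i: list(d) for i, d in dests.items()}
    steps = []
    while any(remaining.values()):
        masked = set()                     # receivers already taken in this step
        step = {}
        # Processors with the most unscheduled messages choose first.
        for i in sorted(remaining, key=lambda i: -len(remaining[i])):
            for k, dst in enumerate(remaining[i]):
                if dst not in masked:
                    masked.add(dst)
                    step[i] = dst
                    del remaining[i][k]
                    break
        steps.append(step)
    return steps


pattern = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0]}
schedule = mmf_schedule(pattern)
for j, step in enumerate(schedule, start=1):
    print(j, step)
print(len(schedule))   # 3
```

In this example processor 0 has three messages to send and three to receive, so three steps is the minimum, and the greedy schedule reaches it.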

2. The product $A^T q_n$

Since each processor stores some rows of $A$ and not its columns, $A^T$ is not readily available
on the processors. In order to calculate $t = A^T q_n$ efficiently without affecting the speedup, its
computation and communication times should not be more expensive than those of $A p_n$.

We compute the product $A^T q_n$ using the results of the MMF algorithm. No additional
preprocessing time is required. Each processor calculates its contribution to the nonzero values of
vector $t$. Those values correspond to the column indices of the nonzero values of $A$ stored in that
processor. Then each processor collects only the nonzero entries of the $t$ vector corresponding to
the rows it is storing. The communication pattern required to collect vector $t$ after calculating it
($t = A^T q_n$) in each processor is the same as the one used to collect vector $p_n$ prior to multiplying
it by $A$. The steps of the ATMPC scheduled by the MMF algorithm in the preprocessing phase
are used again for multiplying $A^T$ by $q_n$.
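
The transposed product with a row-wise distribution can be simulated on a single machine. The sketch below is only an illustration under simple assumptions: rows are stored as {column: value} dictionaries, vector entry j lives on the processor that owns row j, and the function name and data are hypothetical. It records which partial sums would have to travel between processors.

```python
def transpose_product(rows_of, a_rows, q):
    """Compute t = A^T q when each processor holds only some rows of A.

    rows_of[p] lists the row indices owned by processor p; a_rows[i] is a
    sparse row stored as {column: value}. Returns t and the messages
    (sender, receiver) -> {index: partial value} that would be exchanged.
    """
    owner = {i: p for p, rows in rows_of.items() for i in rows}
    t = [0.0] * len(q)
    messages = {}
    for p, rows in rows_of.items():
        partial = {}                                   # local contribution to t
        for i in rows:
            for j, aij in a_rows[i].items():
                partial[j] = partial.get(j, 0.0) + aij * q[i]
        for j, value in partial.items():
            if owner[j] != p:                          # entry owned elsewhere
                messages.setdefault((p, owner[j]), {})[j] = value
            t[j] += value                              # the owner accumulates
    return t, messages


a_rows = [{0: 4.0, 2: 1.0}, {1: 3.0, 3: 2.0}, {0: 1.0, 2: 5.0}, {1: 2.0, 3: 6.0}]
rows_of = {0: [0, 1], 1: [2, 3]}
t, messages = transpose_product(rows_of, a_rows, [1.0, 2.0, 3.0, 4.0])
print(t)          # [7.0, 14.0, 16.0, 28.0]
print(messages)   # {(0, 1): {2: 1.0, 3: 4.0}, (1, 0): {0: 3.0, 1: 8.0}}
```

In this sketch a processor sends a partial sum for column j to the owner of entry j. This is the same pair of processors that exchanges entry j of $p_n$ before the product with $A$, with the direction reversed, so the precomputed schedule can be reused.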

B. Parallelizing the DPr type

The DPr type includes the computation of dot products ($w_n^T v_n$) and vector norms ($\|\tilde{v}_{n+1}\|$).
Since the vectors are distributed among the processors, each processor computes the partial dot
product or partial norm on the portion of the vectors that it has. Then a Global Combine
operation [12] is needed to collect the partial results from all the processors, compute the final
dot product or norm, and then distribute the result back to all the processors.
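
The two phases of the DPr type, local partial sums followed by a global combine, can be mimicked serially. In the sketch below `global_combine` is a hypothetical stand-in for the mesh operation of [12]: it simply sums the partial results and hands the total back to every simulated processor.

```python
import math


def global_combine(partials: list[float]) -> list[float]:
    """Sum the partial results and return the total as seen by every processor."""
    total = sum(partials)
    return [total] * len(partials)


def distributed_dot(x_parts, y_parts) -> list[float]:
    """x_parts[p] and y_parts[p] are the vector slices stored on processor p."""
    partials = [sum(a * b for a, b in zip(xp, yp))
                for xp, yp in zip(x_parts, y_parts)]
    return global_combine(partials)


x_parts = [[1.0, 2.0], [3.0], [4.0, 5.0]]
y_parts = [[5.0, 4.0], [3.0], [2.0, 1.0]]
print(distributed_dot(x_parts, y_parts))     # [35.0, 35.0, 35.0]

# A norm uses the same pattern: combine the partial sums of squares, then take the root.
norms = [math.sqrt(s) for s in distributed_dot(x_parts, x_parts)]
```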

C. Parallelizing the SAXPBY type

Parallelizing the SAXPBY operations is the simplest among the three types. Each processor
scales and adds the portions of the vectors it is storing. No communication is needed among the
processors.


IV.  Results  and  Speedup 


The parallel algorithm was tested for several examples on the Delta multiprocessor [4]. At
this point, the most extensive testing of the parallel code has been for the canonical problem of
electromagnetic scattering from a perfectly conducting sphere of radius 1.2 wavelengths where a
simple first order absorbing boundary condition (ABC) is used. The number of unknowns used
for this example is 197,574. The radar cross section result is compared against the Mie series
solution in Figure 2. The results are good, given that the truncation boundary is 0.3 wavelengths
from the sphere. The corresponding speedup results are shown in Figure 3. The memory per
processor was not enough to run these examples with fewer than 8 processors, so the entries in
Figure 3 are normalized with respect to the results obtained with 8 processors. The speedup
obtained while running the QMR on 16 processors is 1.96 as compared with QMR running on
8 processors. This is a very good result that approaches the ideal value of two. As the number
of processors increases, the deviation of the speedup from the ideal becomes more obvious. This
behavior is expected since the communication time becomes more significant as the amount of
computation on each processor is reduced.

Figure 2: RCS for a 1.2 wavelength sphere. Comparison of the FEM solution terminated by an
ABC to a series solution.


Figure 3: Execution time and speedup for the sphere example (197,574 variables) when run on
the Intel Touchstone Delta.


V.  References 


[1] J. F. Lee and R. Mittra, “A note on the application of edge-elements for modeling
three-dimensional inhomogeneously-filled cavities,” IEEE Transactions on Microwave Theory and
Techniques, vol. 40, pp. 1767-1773, September 1992.

[2] R. W. Freund and N. M. Nachtigal, “An implementation of the QMR method based
on coupled two-term recurrences,” Numerical Analysis Manuscript 92-06, AT&T Bell
Laboratories, 1992.

[3] R. W. Freund and N. M. Nachtigal, “Implementation details of the coupled QMR algorithm,”
Numerical Analysis Manuscript 92-12, AT&T Bell Laboratories, October 1992.

[4] Intel Corporation, “A Touchstone DELTA system description,” February 1991.

[5] L. Hamandi, R. Lee, and F. Özgüner, “A domain decomposition technique for the parallel
solution of linear systems of equations resulting from finite element discretization,” in
The First International Conference on Electronics, Circuits, and Systems, (Cairo, Egypt),
pp. 978-983, December 1994.

[6] S. Ranka, J. C. Wang, and M. Kumar, “Personalized communication avoiding node contention
on distributed memory systems,” in 1993 International Conference on Parallel Processing,
pp. 1241-1244, 1993.

[7] D. S. Scott, “Efficient all-to-all communication patterns in hypercube and mesh topologies,”
in Proceedings of the 6th Distributed Memory Concurrent Computers, pp. 398-403, 1991.

[8] S. H. Bokhari and H. Berriman, “Complete exchange on a circuit switched mesh,” in Scalable
High Performance Computing Conference, pp. 300-306, April 1992.

[9] S. Gupta, S. Hawkinson, and B. Baxter, “A binary interleaved algorithm for complete
exchange on a mesh architecture,” tech. rep., Intel Corporation, Supercomputer Systems
Division, Beaverton, OR, 1993.

[10] S. Takkela and S. Seidel, “Broadcast and complete exchange algorithms for mesh topologies,”
Tech. Rep. 93-04, Department of Computer Science, Michigan Technological University,
Houghton, Michigan, November 1993.

[11] R. Thakur and A. Choudhary, “All-to-all communication on meshes with wormhole routing,”
in Proceedings of the International Parallel Processing Symposium, April 1994.

[12] M. Barnett, R. Littlefield, D. G. Payne, and R. van de Geijn, “Global combine on mesh
architectures with wormhole routing,” in Proceedings of the 7th International Parallel
Processing Symposium, IEEE Computer Society Press, April 1993.




Advanced  Parallel  Solver  Techniques 


Adrian  S.  King 

Intel  Supercomputer  System  Division 
CO6-10,  Zone  8 
14924  NW  Greenbrier  Pkwy 
Beaverton,  OR  97006 


A key element to the future of Computational Electromagnetics (CEM) is the development of advanced
solver algorithms for a variety of systems of equation types such as sparse/dense, symmetric/unsymmetric,
and real/complex. Success here will require major advances in mathematics in order to solve important
electromagnetics problems. The magnitude of the problem at hand is demonstrated by looking at the time
required today to solve large complex dense linear systems of various sizes utilizing current LU
decomposition techniques at a sustained TeraFLOP rate.


Figure: TeraFLOPS solution time today versus linear system size (number of unknowns, in millions)
utilizing current LU decomposition techniques.
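
The order of magnitude behind such a plot can be estimated with a short calculation. The sketch below is an illustration and not the data of the figure: it assumes the usual count of about (8/3) n^3 real floating-point operations for the LU factorization of a dense complex n x n matrix, ignores the triangular solves, and uses a hypothetical function name.

```python
def lu_time_seconds(n_unknowns: float, flops_per_second: float = 1e12) -> float:
    """Estimated time for a dense complex LU factorization at a sustained rate.

    Assumes (8/3) n^3 real floating-point operations for the factorization.
    """
    return (8.0 / 3.0) * n_unknowns ** 3 / flops_per_second


for n in (1e5, 1e6, 1e7):
    t = lu_time_seconds(n)
    print(f"n = {n:.0e}: {t:.3g} s ({t / 86400:.3g} days)")
```

Under these assumptions one million unknowns already take about 31 days at a sustained TeraFLOP rate, and a tenfold increase in the number of unknowns multiplies the time by a thousand.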

Clearly, alternatives to such "brute-force" solutions must be pursued. This presentation will describe
advanced solver development efforts and their parallel implementations. Both iterative and direct solution
techniques will be addressed.

Abstract - Development of code for solving radiation problems using the Finite Difference Time Domain (FDTD) method
requires consideration of three major aspects. These are 1) the core FDTD algorithm, 2) the absorbing boundary condition
(ABC) and 3) the near to far zone transformation. In this paper, methods of parallelizing each of these aspects are
discussed. The computer used is a CM-5 (connection machine) with 32 processors, each with four vector units (VU). The
programming language used is CM Fortran. Performance of the parallelized code for the various parts of the FDTD
approach is compared with existing serial code. The results of this comparison are encouraging. The parallel core FDTD
algorithm run on the CM-5 is found to run approximately 100 times faster than the Fortran 77 code on a SUN SPARC-2
workstation. The parallel code determines radiation patterns about 27 times faster than the serial code.

Introduction 

The  Finite  Difference  Time  Domain  method  (FDTD)  has  been  extensively  studied  and  successfully  applied  to  many 
electromagnetic  problems,  for  example  [1].  The  technique  is  relatively  simple  to  implement  and  can  be  applied  to  problems 
with  complicated  geometries  and  inhomogeneous,  anisotropic  materials.  However,  its  computer  intensive  nature  can  limit 
its  usefulness  as  a  design  tool.  Large  problems  can  take  several  hours  to  run  on  a  high  performance  workstation.  The 
development  in  parallel  computing  techniques,  especially  the  development  of  MPC  (massively  parallel  computers)  provides 
a  way  to  reduce  computer  run  time.  Such  techniques  have  attracted  the  attention  of  the  electromagnetic  community  [2-4]. 

Parallelizing  the  FDTD  for  a  variety  of  computer  architectures  has  been  reported.  A  HP-735  workstation  cluster  is  used  in 
[2],  the  ASP  (Associative  String  Processor)  which  is  a  SIMD  (single  instruction,  multiple  data)  machine  is  used  in  [3],  and 
a  HP400,  a  486-based  PC  and  a  386-based  PC  with  transputer  arrays  are  used  in  [4].  As  different  parallel  computer  systems 
have  different  structures,  the  programming  strategies,  and  even  the  programming  languages,  are  also  different.  In  these 
references  a  method  for  parallelizing  the  near  to  far  zone  transformation  has  not  been  described. 

In  this  paper  a  method  of  parallelizing  the  FDTD  approach  for  antenna  analysis  is  discussed.  The  computer  used  is  a  CM-5 
(connection  machine)  which  has  32  processors.  The  programming  language  used  is  CM  Fortran.  The  three  major  steps 
involved  in  solving  radiation  problems  by  the  FDTD  method,  that  is  the  FDTD  algorithm,  the  absorbing  boundary  condition 
(here  Mur’s  ABCs  are  used),  and  the  near  to  far  zone  transformation  are  discussed  in  detail.  The  development  of  the  code 
using  CM  Fortran  and  the  results  obtained  are  presented. 

CM-5  and  CM  Fortran 

The  CM-5  is  the  connection  machine  produced  by  Thinking  Machines  Corporation,  USA.  A  CM-5  has  a  number  of 
processors  with  each  processor  having  four  vector  units  (VU).  Each  VU  has  its  own  memory.  The  CM-5  has  a  scalable 
architecture  which  allows  the  number  of  processors  to  be  increased.  The  CM-5  used  here  has  32  processors  and  a  total  of 
128 VUs. Each VU has 8 Mbit memory and so the computer has 1024 Mbit memory. Users can choose to use either 16 or
32 processors.

The programming language used in the present work is CM Fortran, which is Thinking Machines' version of the Fortran
language. It is very similar to, and in the future will incorporate, the High Performance Fortran (HPF) subset standard. It is
a subset of Fortran 90 with two important extensions. The first is a set of compiler directives (which appear to other
compilers as comments) that specify how data is to be distributed over the processors; the second is the FORALL statement
(which was originally proposed for, but then dropped from, the Fortran 90 standard). In particular, parallelism is explicitly
represented by the programmer using Fortran 90 array operations. No attempt is made to parallelize DO loops.

Normally, the bottleneck for CM Fortran is the time-consuming communication between processors. The main
programming issue is to avoid such communications and, if they are unavoidable, to make them as efficient as possible.

The CM-5 can be viewed as consisting of a number of virtual processors, each with its own memory. If no axis of an array is specified as serial, every element of the array is stored in a separate virtual processor's memory. In this case, if two arrays have the same shape and size, then elements having the same subscripts will be stored in the same virtual processor's memory and so the operations on them can be performed without any communications. If the program operates on two elements which have different subscripts, then the compiler will generate communication code even when the elements are stored in the same VU, because memory allocation for the distributed memory is a run-time property.

Another important consideration when using the CM-5 is the array size. In conventional serial code, run time is directly related to array size. On the CM-5 there exist optimum array dimensions. For example, 16x16 element arrays can be processed as quickly, and with the same memory requirements, as 9x9 element arrays. The consequence of this is that, in many cases, larger problems can be tackled or the absorbing boundaries can be moved further away from the radiating structure without penalty. A detailed description of this feature can be found in the CM Fortran handbook for the CM-5 [5].

Parallelizing  FDTD  Algorithm 

The FDTD method is a direct solution of Maxwell's time-dependent curl equations. In an isotropic medium, Maxwell's equations can be written as

$$\nabla \times \mathbf{E} = -\mu \frac{\partial \mathbf{H}}{\partial t}, \qquad \nabla \times \mathbf{H} = \sigma \mathbf{E} + \varepsilon \frac{\partial \mathbf{E}}{\partial t} \qquad (1)$$

In a Cartesian coordinate system $(x, y, z)$, the vector fields E and H can be expressed as $\mathbf{E} = \hat{x}E_x + \hat{y}E_y + \hat{z}E_z$ and $\mathbf{H} = \hat{x}H_x + \hat{y}H_y + \hat{z}H_z$, and so the above two vector equations can be decomposed into six scalar equations, which can be discretized by central finite-difference approximation to obtain the six commonly used FDTD equations (2) [6].

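$$H_x^{n+1}(i,j,k) = H_x^n(i,j,k) + R_b\left[E_y^n(i,j,k+1) - E_y^n(i,j,k) + E_z^n(i,j,k) - E_z^n(i,j+1,k)\right] \qquad (2a)$$

$$H_y^{n+1}(i,j,k) = H_y^n(i,j,k) + R_b\left[E_z^n(i+1,j,k) - E_z^n(i,j,k) + E_x^n(i,j,k) - E_x^n(i,j,k+1)\right] \qquad (2b)$$

$$H_z^{n+1}(i,j,k) = H_z^n(i,j,k) + R_b\left[E_x^n(i,j+1,k) - E_x^n(i,j,k) + E_y^n(i,j,k) - E_y^n(i+1,j,k)\right] \qquad (2c)$$

$$E_x^{n+1}(i,j,k) = C_a E_x^n(i,j,k) + C_b\left[H_z^{n+1}(i,j,k) - H_z^{n+1}(i,j-1,k) + H_y^{n+1}(i,j,k-1) - H_y^{n+1}(i,j,k)\right] \qquad (2d)$$

$$E_y^{n+1}(i,j,k) = C_a E_y^n(i,j,k) + C_b\left[H_x^{n+1}(i,j,k) - H_x^{n+1}(i,j,k-1) + H_z^{n+1}(i-1,j,k) - H_z^{n+1}(i,j,k)\right] \qquad (2e)$$

$$E_z^{n+1}(i,j,k) = C_a E_z^n(i,j,k) + C_b\left[H_y^{n+1}(i,j,k) - H_y^{n+1}(i-1,j,k) + H_x^{n+1}(i,j-1,k) - H_x^{n+1}(i,j,k)\right] \qquad (2f)$$

where

$$R_b = \frac{\delta t}{\mu(i,j,k)\,\delta}, \qquad C_a = 1 - \frac{\sigma(i,j,k)\,\delta t}{\varepsilon(i,j,k)}, \qquad C_b = \frac{\delta t}{\varepsilon(i,j,k)\,\delta} \qquad (3)$$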


Also $\delta = \delta x = \delta y = \delta z$ (because a cubic cell is used here), and $\delta t$ is the time increment. The above six FDTD equations form the largest computational part of the FDTD method. Parallelization of these equations involves several steps, which are now described.

Step 1. Define arrays. There are 24 different terms in (2); we define each term as a separate array. Let na1 and n represent superscripts n+1 and n respectively. Also let ia1, is1, ja1, js1, ka1 and ks1 represent subscripts i+1, i-1, j+1, j-1, k+1 and k-1 respectively. The FDTD space domain is bounded by the rectangular box $x = (0, \delta I_{max})$, $y = (0, \delta J_{max})$ and $z = (0, \delta K_{max})$. In the following, Imax, Jmax and Kmax are the upper bounds for i, j and k respectively. The 24 arrays may be defined as

      REAL, ARRAY(0:Imax, 0:Jmax, 0:Kmax) ::
     &  Ex_na1, Ex_n, Ex_n_ja1, Ex_n_ka1, Hx_na1, Hx_n, Hx_na1_js1, Hx_na1_ks1,
     &  Ey_na1, Ey_n, Ey_n_ia1, Ey_n_ka1, Hy_na1, Hy_n, Hy_na1_is1, Hy_na1_ks1,
     &  Ez_na1, Ez_n, Ez_n_ia1, Ez_n_ja1, Hz_na1, Hz_n, Hz_na1_is1, Hz_na1_js1

Here each term is defined as a separate array to improve the readability of this paper. However, by reusing names, the number of arrays may be reduced to eight to reduce memory requirements.

To implement (2) in parallel, the size and shape of all of the arrays should be the same as defined above, so that elements having the same subscripts will be stored in the same location of the virtual processor's memory. This helps to make the communications simple, thus improving the efficiency of the code.

Step 2. Update the field component values. The values computed in the previous time step (labelled n+1) become the current values (labelled n), that is

      Ex_n = Ex_na1
      Ey_n = Ey_na1
      Ez_n = Ez_na1
      Hx_n = Hx_na1
      Hy_n = Hy_na1
      Hz_n = Hz_na1

Step 3. Move the data. The six FDTD equations involve a large amount of communication. For example, in (2a) six elements are involved: $H_x^{n+1}(i,j,k)$, $H_x^n(i,j,k)$, $E_y^n(i,j,k)$, $E_z^n(i,j,k)$, $E_y^n(i,j,k+1)$ and $E_z^n(i,j+1,k)$.

The first four elements have the same subscripts, while the other two have different subscripts and so communications are unavoidable. The calculation can be performed only after $E_y^n(i,j,k+1)$ and $E_z^n(i,j+1,k)$ are moved to the virtual processor's memory where $H_x^{n+1}(i,j,k)$ is stored. An efficient method to move the elements of arrays is to use the circular shift command, CSHIFT, an intrinsic function in CM Fortran (also in HPF), as shown below.


      Ex_n_ja1 = CSHIFT(Ex_n, dim=2, shift=1)                              (4a)
      Ex_n_ka1 = CSHIFT(Ex_n, dim=3, shift=1)                              (4b)
      Ey_n_ia1 = CSHIFT(Ey_n, dim=1, shift=1)                              (4c)
      Ey_n_ka1 = CSHIFT(Ey_n, dim=3, shift=1)                              (4d)
      Ez_n_ia1 = CSHIFT(Ez_n, dim=1, shift=1)                              (4e)
      Ez_n_ja1 = CSHIFT(Ez_n, dim=2, shift=1)                              (4f)

      Hx_na1 = Hx_n + Rb*(Ey_n_ka1 - Ey_n + Ez_n - Ez_n_ja1)               (5a)
      Hy_na1 = Hy_n + Rb*(Ez_n_ia1 - Ez_n + Ex_n - Ex_n_ka1)               (5b)
      Hz_na1 = Hz_n + Rb*(Ex_n_ja1 - Ex_n + Ey_n - Ey_n_ia1)               (5c)

      Hx_na1_js1 = CSHIFT(Hx_na1, dim=2, shift=-1)                         (6a)
      Hx_na1_ks1 = CSHIFT(Hx_na1, dim=3, shift=-1)                         (6b)
      Hy_na1_is1 = CSHIFT(Hy_na1, dim=1, shift=-1)                         (6c)
      Hy_na1_ks1 = CSHIFT(Hy_na1, dim=3, shift=-1)                         (6d)
      Hz_na1_is1 = CSHIFT(Hz_na1, dim=1, shift=-1)                         (6e)
      Hz_na1_js1 = CSHIFT(Hz_na1, dim=2, shift=-1)                         (6f)

      Ex_na1 = Ca*Ex_n + Cb*(Hz_na1 - Hz_na1_js1 + Hy_na1_ks1 - Hy_na1)    (7a)
      Ey_na1 = Ca*Ey_n + Cb*(Hx_na1 - Hx_na1_ks1 + Hz_na1_is1 - Hz_na1)    (7b)
      Ez_na1 = Ca*Ez_n + Cb*(Hy_na1 - Hy_na1_is1 + Hx_na1_js1 - Hx_na1)    (7c)


Step  4.  Perform  calculations.  After  shifting  the  arrays,  all  terms  in  (2)  will  be  located  in  the  same  virtual  processor's 
memory.  Thus  all  virtual  processors  will  perform  the  calculations  simultaneously.  The  code  for  this  step  is  shown  in 
equations  (5)  and  (7)  above,  where  Rb,  Ca  and  Cb  are  as  given  in  equation  (3). 
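The shift-then-update pattern of steps 3 and 4 can be illustrated with a minimal sketch. It is written in Python rather than CM Fortran and reduced to one dimension (one Ez and one Hy component, so the Ex and Hx terms drop out); the helper names `cshift` and `fdtd_step_1d` are illustrative assumptions and not part of the original code. The helper reproduces the circular-shift convention in which a shift of +1 brings element i+1 to position i.

```python
def cshift(values, shift):
    """Circular shift with CSHIFT-like semantics: result[i] = values[(i + shift) % n]."""
    n = len(values)
    return [values[(i + shift) % n] for i in range(n)]


def fdtd_step_1d(ez, hy, rb, ca, cb):
    """Advance a 1D Ez/Hy pair by one time step using whole-array operations only."""
    # Bring Ez(i+1) to position i, as the shifts (4) do in three dimensions.
    ez_ia1 = cshift(ez, 1)
    # Element-wise H update, the 1D counterpart of (5b).
    hy_new = [h + rb * (e1 - e0) for h, e1, e0 in zip(hy, ez_ia1, ez)]
    # Bring Hy(i-1) to position i, as the shifts (6) do.
    hy_is1 = cshift(hy_new, -1)
    # Element-wise E update, the 1D counterpart of (7c).
    ez_new = [ca * e + cb * (h0 - h1) for e, h0, h1 in zip(ez, hy_new, hy_is1)]
    return ez_new, hy_new


if __name__ == "__main__":
    ez0 = [0.0, 0.0, 1.0, 0.0, 0.0]
    hy0 = [0.0] * 5
    print(fdtd_step_1d(ez0, hy0, rb=0.5, ca=1.0, cb=0.5))
```

After the shifts, every term of an update sits at the same index, so the update itself is a purely element-wise operation; this is the property that lets each virtual processor work without further communication.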

Parallelizing  Mur’s  ABCs 

Only the implementation of Mur's ABC for the $E_z$ field component on the plane x = 0 is shown here. Other cases can be done similarly. Eliminating the half-integer values by replacing k-1/2, k+1/2 and k+3/2 with k-1, k and k+1 respectively, equations (15) and (16) in [7], the first and second order Mur ABCs, become

$$E_z^{n+1}(0,j,k) = E_z^n(1,j,k) + C_1\left(E_z^{n+1}(1,j,k) - E_z^n(0,j,k)\right) \qquad (8)$$

$$\begin{aligned}
E_z^{n+1}(0,j,k) = {} & -E_z^{n-1}(1,j,k) \\
& + C_1\left(E_z^{n+1}(1,j,k) + E_z^{n-1}(0,j,k)\right) + C_2\left(E_z^n(0,j,k) + E_z^n(1,j,k)\right) \\
& + C_3\big(E_z^n(0,j+1,k) + E_z^n(0,j-1,k) + E_z^n(1,j+1,k) + E_z^n(1,j-1,k) \\
& \qquad + E_z^n(0,j,k+1) + E_z^n(0,j,k-1) + E_z^n(1,j,k+1) + E_z^n(1,j,k-1)\big)
\end{aligned} \qquad (9)$$

where

$$C_1 = \frac{c_0\,\delta t - \delta}{c_0\,\delta t + \delta}, \qquad C_2 = 2\left(1 - \frac{c_0\,\delta t}{\delta}\right), \qquad C_3 = \frac{(c_0\,\delta t)^2}{2\delta\,(c_0\,\delta t + \delta)} \qquad (10)$$


Mur's second order ABC cannot be applied in the corner regions, as it uses information from points tangential to the boundary. In corner regions Mur's first order ABC is used instead.
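As a small worked example of the coefficients in (10) and of the first order condition (8), the following Python sketch evaluates them for one choice of cell size and time step. The function names and the numerical values (a 5 mm cell and a time step of half the cell transit time) are assumptions chosen for illustration only; they are not taken from the original program.

```python
def mur_coefficients(c0, dt, delta):
    """Return (C1, C2, C3) as written in equation (10)."""
    cdt = c0 * dt
    c1 = (cdt - delta) / (cdt + delta)
    c2 = 2.0 * (1.0 - cdt / delta)
    c3 = cdt ** 2 / (2.0 * delta * (cdt + delta))
    return c1, c2, c3


def mur_first_order(e_new_1, e_old_0, e_old_1, c1):
    """Equation (8): new boundary value at i = 0 from its neighbour at i = 1."""
    return e_old_1 + c1 * (e_new_1 - e_old_0)


if __name__ == "__main__":
    c0 = 3.0e8               # assumed free-space wave speed, m/s
    delta = 5.0e-3           # assumed cell size, m
    dt = delta / (2.0 * c0)  # assumed time step
    print(mur_coefficients(c0, dt, delta))
```

With this time step the coefficients come out as C1 = -1/3, C2 = 1 and C3 = 1/12. If instead c0·δt = δ, then C1 = 0 and (8) simply copies the previous value of the neighbouring cell onto the boundary.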

While $E_z$ is a three-dimensional array and lies across the nodes, the operations in (8) and (9) involve the elements of $E_z$ on two surfaces (i = 0 and i = 1) only. The steps involved in implementing the ABCs (8) and (9) are now described.

Step 1. Define arrays. There are 14 different terms in equations (8) and (9); we define each term as a separate array. As in step 1 of the last section, let na1, n and ns1 represent superscripts n+1, n and n-1 respectively. Also let 0 represent i = 0 and 1 represent i = 1. Let ja1, js1, ka1 and ks1 represent subscripts j+1, j-1, k+1 and k-1 respectively. The 14 terms are defined as

      REAL, ARRAY(0:Jmax, 0:Kmax) ::
     &  Ez_na1_0, Ez_na1_1, Ez_n_0, Ez_n_1, Ez_ns1_0, Ez_ns1_1,
     &  Ez_n_0_js1, Ez_n_0_ja1, Ez_n_0_ks1, Ez_n_0_ka1,
     &  Ez_n_1_js1, Ez_n_1_ja1, Ez_n_1_ks1, Ez_n_1_ka1

Step 2. Find the field values at times n and n-1. This is similar to step 2 in the last section.

      Ez_ns1_0 = Ez_n_0
      Ez_ns1_1 = Ez_n_1
      Ez_n_0 = Ez_na1_0
      Ez_n_1 = Ez_na1_1


Step 3. Move the data. There are 8 terms in (9) whose subscripts are not (0,j,k) or (1,j,k). As in step 3 of the last section, they are shifted using CSHIFT.

      Ez_n_0_js1 = CSHIFT(Ez_n_0, dim=1, shift=-1)
      Ez_n_0_ja1 = CSHIFT(Ez_n_0, dim=1, shift= 1)
      Ez_n_0_ks1 = CSHIFT(Ez_n_0, dim=2, shift=-1)
      Ez_n_0_ka1 = CSHIFT(Ez_n_0, dim=2, shift= 1)
      Ez_n_1_js1 = CSHIFT(Ez_n_1, dim=1, shift=-1)
      Ez_n_1_ja1 = CSHIFT(Ez_n_1, dim=1, shift= 1)
      Ez_n_1_ks1 = CSHIFT(Ez_n_1, dim=2, shift=-1)
      Ez_n_1_ka1 = CSHIFT(Ez_n_1, dim=2, shift= 1)


Step 4. Extract Ez_na1 on the surface i = 1 and pass the values to Ez_na1_1, that is

      Ez_na1_1 = Ez_na1(1, :, :)                                  (11)

This is efficiently achieved by using a communication compiler in the CMSSL library, as below

      call comm_get(Ez_na1_1, trace_extract, Ez_na1, ier)         (12)

A detailed explanation of comm_get (and the associated comm_setup and comm_send) may be found in the handbook for CMSSL of CM Fortran [5].

Step 5. Apply Mur's first order ABC using the following operation. This obtains the boundary values of the corner regions. Note that this operation is performed on the whole surface i = 0. The boundary values of the inner region are overwritten by Mur's second order ABC in a subsequent operation (see step 6 of this section).

      Ez_na1_0 = Ez_n_1 + C1*(Ez_na1_1 - Ez_n_0)

where C1 is as shown in (10).

Step 6. Apply Mur's second order ABC for the inner region using the following operations. Here C1, C2 and C3 are as shown in (10). Also Jms1 = Jmax-1 and Kms2 = Kmax-2.


      Ez_na1_0(1:Jms1, 1:Kms2) = - Ez_ns1_1(1:Jms1, 1:Kms2)
     &  + C1 * ( Ez_na1_1(1:Jms1, 1:Kms2)   + Ez_ns1_0(1:Jms1, 1:Kms2) )
     &  + C2 * ( Ez_n_0(1:Jms1, 1:Kms2)     + Ez_n_1(1:Jms1, 1:Kms2) )
     &  + C3 * ( Ez_n_0_ja1(1:Jms1, 1:Kms2) + Ez_n_0_js1(1:Jms1, 1:Kms2)
     &         + Ez_n_1_ja1(1:Jms1, 1:Kms2) + Ez_n_1_js1(1:Jms1, 1:Kms2)
     &         + Ez_n_0_ka1(1:Jms1, 1:Kms2) + Ez_n_0_ks1(1:Jms1, 1:Kms2)
     &         + Ez_n_1_ka1(1:Jms1, 1:Kms2) + Ez_n_1_ks1(1:Jms1, 1:Kms2) )

Step 7. Replace Ez_na1(0,:,:) by Ez_na1_0, i.e. do the operation

      Ez_na1(0, :, :) = Ez_na1_0                                  (13)

This is achieved efficiently by using a communication compiler from the CMSSL library, that is

      call comm_send(Ez_na1, trace_replace, Ez_na1_0, ier)        (14)

The extraction and replacement operations in step 4 and step 7 of this section can be performed by directly incorporating equations (11) and (13) into the code. Such an approach is less efficient than using the communication compiler instructions of equations (12) and (14), which are about four times faster. (Even so, steps 4 and 7 still use the majority of the computer time. If the total time from step 2 to step 7 is 1 second, then step 4 will use about 0.64 second and step 7 about 0.23 second.)

Another technique used in our program is to obtain Ez_na1(0,:,:) and Ez_na1(Imax,:,:) simultaneously. This can be done by slightly changing the above steps.


Parallelizing  Near  Zone  to  Far  Zone  Transformation 

The far zone transformation method used here follows from [1,8]. Let S' be a closed surface which is wholly within the FDTD space domain and encloses all antennas and scatterers (if any). As Cartesian coordinates are used here, S' is a rectangular box. By integrating the contributions from the tangential electromagnetic fields on all cell surfaces on S', the far zone electric fields $E_\theta$ and $E_\phi$ can be obtained as

$$E_\theta = -\eta W_\theta - U_\phi, \qquad E_\phi = -\eta W_\phi + U_\theta \qquad (15)$$

where $\eta$ is the impedance of free space, and $W_\theta$, $W_\phi$, $U_\theta$ and $U_\phi$ are found from [6] as

$$\mathbf{W}(\mathbf{r}, t) = \frac{1}{4\pi r c}\,\frac{\partial}{\partial t}\left[\iint_{S'} \mathbf{J}_s\!\left(t + (\mathbf{r}'\cdot\hat{r})/c - r/c\right) ds'\right]$$

$$\mathbf{U}(\mathbf{r}, t) = \frac{1}{4\pi r c}\,\frac{\partial}{\partial t}\left[\iint_{S'} \mathbf{M}_s\!\left(t + (\mathbf{r}'\cdot\hat{r})/c - r/c\right) ds'\right] \qquad (16)$$

and

$$\mathbf{J}_s = \hat{n} \times \mathbf{H}, \qquad \mathbf{M}_s = -\hat{n} \times \mathbf{E}. \qquad (17)$$

In this paper, we only consider the contribution to the far zone field from $E_x$ on the face of S' which has an outward unit normal vector $\hat{y}$. We denote this portion of S' by the subscript yh. The contributions from the other tangential fields on this surface, as well as from the tangential fields on the other five surfaces, can be obtained in a similar way.

A sinusoidal excitation is used for the radiation pattern calculation. The source is switched on at t = 0 (in the code at n = 0; n is the time step). After computing for n = Nmin time steps (Nmin should be large enough to ensure that the steady state has been reached), the tangential electromagnetic fields at the centre of each cell surface on the surface S' are calculated and stored in memory until n = Nmax is reached. (Nmax should be large enough to let relation (24) hold, see below.) The tangential E-fields at the centre of a cell surface are obtained by averaging the two E-fields adjacent to the cell surface centre. The tangential H-fields are obtained by averaging the four H-fields adjacent to the cell surface centre. The process of obtaining tangential E- and H-fields is similar to obtaining boundary values by Mur's ABCs. $E_x$ at cell surface centres on the face $S'_{yh}$ is obtained in this process. The field components are three-dimensional arrays with the third axis being time and may be expressed as, for example, $E_{x,yh}(i,k,n)$. After $E_{x,yh}$ and the other tangential field components on S' are obtained between time steps Nmin and Nmax and stored in memory, they are used to calculate the far zone field.

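As seen in [8], $E_x$ on $S'_{yh}$ contributes to $U_z$ only. Averaging the left hand sides of the two equations (13)-(14) in [8], we may find the contribution of $E_{x,yh}$ on a single cell surface (denoted by subscripts i, k) to $U_z$ as

$$U_{z,ik} = \frac{\delta x\,\delta z}{8\pi c\,\delta t}\left[E_{x,yh}(i,k,m+1) - E_{x,yh}(i,k,m-1)\right] \qquad (18)$$

where

$$m = \mathrm{NINT}\!\left(m_0 + (\mathbf{r}'\cdot\hat{r})/(c\,\delta t)\right) \qquad (19)$$

$$m_0 = \mathrm{NINT}\!\left((t - r/c)/\delta t\right) \qquad (20)$$

$$\hat{r} = \sin\theta\cos\phi\,\hat{x} + \sin\theta\sin\phi\,\hat{y} + \cos\theta\,\hat{z} \qquad (21)$$

$$\mathbf{r}' = (i + 1/2)\,\delta x\,\hat{x} + J_{yh}\,\delta y\,\hat{y} + (k + 1/2)\,\delta z\,\hat{z} \qquad (22)$$

Here the operator NINT is as for the FORTRAN generic function which rounds a real number to the nearest whole number. Note that r does not appear in (18), as it can be chosen arbitrarily because only relative values of the far fields are required. Using (21)-(22) in (19), with $\delta x = \delta y = \delta z = \delta$, leads to

$$m = \mathrm{NINT}\!\left(m_0 + \frac{\delta\left[(i + 1/2)\sin\theta\cos\phi + J_{yh}\sin\theta\sin\phi + (k + 1/2)\cos\theta\right]}{c\,\delta t}\right) \qquad (23)$$

We can see that m can be constructed as a four-dimensional array, expressed as $M(i,k,\theta,\phi)$. Again note that we do not have to know the values of t and r in (20). Instead, we may just choose $m_0$ properly, ensuring that the relation

$$N_{min} < m < N_{max} \qquad (24)$$

holds for all six faces of S'.

By summing $U_{z,ik}$ over i and k and replacing $\hat{r}$ with its $\theta$ and $\phi$ components in (18), $U_z$ due to $E_{x,yh}$ on $S'_{yh}$ is

$$U_z(\theta,\phi) = \frac{\delta x\,\delta z}{8\pi c\,\delta t}\sum_i \sum_k \left[E_{x,yh}(i,k,M+1) - E_{x,yh}(i,k,M-1)\right], \qquad M = M(i,k,\theta,\phi).$$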

The main steps needed to implement this equation in CM Fortran are shown below. For the sake of improved readability the symbols θ, φ and δ are retained in the code listing although they are illegal in CM Fortran.

Step 1. Define arrays. In the following definitions, mi and ma represent the lower and upper bound respectively for the relevant variables, except for Nmin and Nmax. The variables Exyh_cshift and Temp are explained in steps 3 and 5 below.

      real, array(Imi:Ima, Kmi:Kma, Nmin:Nmax) :: Exyh, Exyh_cshift
      integer, array(Imi:Ima, Kmi:Kma, θmi:θma, φmi:φma) :: M
      real, array(Imi:Ima, Kmi:Kma, θmi:θma, φmi:φma) :: Temp
      real, array(θmi:θma, φmi:φma) :: Uz

Step 2. Use compiler directives to control the layout of the arrays, i.e. control how the arrays are placed in the distributed memory. It will become clear in step 5 why the third and fourth axes should be distributed on a single node rather than across the nodes. (Compiler directives must start from column 1.)

CMF$  LAYOUT Exyh(:news, :news, :serial)
CMF$  LAYOUT Exyh_cshift(:news, :news, :serial)
CMF$  LAYOUT M(:news, :news, :serial, :serial)
CMF$  LAYOUT Temp(:news, :news, :serial, :serial)

Step 3. For all i, k and m, perform the operation Exyh(i,k,m+1) - Exyh(i,k,m-1) and assign the results to the array Exyh_cshift. That is

      Exyh_cshift = CSHIFT(Exyh, 3, 1) - CSHIFT(Exyh, 3, -1)

Step 4. Obtain the array M. In this step, the 'distances' from the far field point (θ,φ) to the cell surface centres on face $S'_{yh}$ are calculated in parallel.

      forall (i=Imi:Ima, k=Kmi:Kma, θ=θmi:θma, φ=φmi:φma)
     &   M(i,k,θ,φ) = NINT(M0 + ((i+0.5)*sin(θ)*cos(φ)
     &     + Jyh*sin(θ)*sin(φ) + (k+0.5)*cos(θ)) * δ/(c*δt))

Step 5. Sum $U_{z,ik}$ over i and k. In this step, a gather operation along the third axis of Exyh_cshift is performed. The CM Fortran utility library has an efficient gather/scatter algorithm along a serial axis, which is why the third and fourth axes must be declared :serial in step 2.

      forall (i=Imi:Ima, k=Kmi:Kma, θ=θmi:θma, φ=φmi:φma)
     &   Temp(i,k,θ,φ) = Exyh_cshift(i,k,M(i,k,θ,φ))

      Uz = A*SUM(SUM(Temp, DIM=2), DIM=1)

where A = (δx δz)/(8π c δt).
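The index calculation of step 4 and the gather-and-sum of step 5 can be sketched serially as follows. This is a minimal Python illustration for a single far field direction, not the parallel implementation: the function names are assumptions, array indices start at zero (so Nmin corresponds to index 0), and a cubic cell δx = δz = δ is assumed.

```python
import math


def nint(x):
    """Round to the nearest integer, halves away from zero (as Fortran NINT does)."""
    return int(math.floor(abs(x) + 0.5)) * (1 if x >= 0 else -1)


def delay_index(m0, i, k, jyh, theta, phi, delta, c, dt):
    """Time index m of equation (23) for cell (i, k) and direction (theta, phi)."""
    proj = ((i + 0.5) * math.sin(theta) * math.cos(phi)
            + jyh * math.sin(theta) * math.sin(phi)
            + (k + 0.5) * math.cos(theta))
    return nint(m0 + proj * delta / (c * dt))


def far_zone_uz(exyh, m0, jyh, theta, phi, delta, c, dt):
    """Sum the per-cell contributions (18) over i and k for one direction."""
    a = delta * delta / (8.0 * math.pi * c * dt)
    total = 0.0
    for i, plane in enumerate(exyh):
        for k, history in enumerate(plane):
            m = delay_index(m0, i, k, jyh, theta, phi, delta, c, dt)
            # Relation (24): m must lie strictly inside the stored time window.
            assert 1 <= m <= len(history) - 2
            total += history[m + 1] - history[m - 1]
    return a * total
```

In the parallel code the inner difference is precomputed once for all time steps (step 3), so that the loop body reduces to a gather along the serial time axis.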


Results 


The parallel FDTD code that has been described was applied to the problem of references [9,10], that is, a λ/4 monopole on a conducting box in the presence of a simple hand-head model. The results obtained were similar to the reported results. Table 1 shows the comparison of CPU time for a serial code run on a SUN SPARC-2 workstation and CM busy time for the parallel code run on a CM-5 for the core FDTD algorithm. The parallel code is about 60 times faster than the serial code when 16 processors are used, and about 100 times faster when 32 processors are used.
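The quoted speed-up factors can be checked directly from the times listed in Table 1. The short Python sketch below only performs this arithmetic; the variable names are illustrative.

```python
# Times in seconds, taken from Table 1 (1500 time steps).
serial = {"55x46x56": 3059.0, "96x86x102": 11478.0}
cm5_16 = {"55x46x56": 47.0, "96x86x102": 194.0}
cm5_32 = {"55x46x56": 28.0, "96x86x102": 113.0}

# Speed-up = serial CPU time / CM busy time, for 16 and 32 processors.
speedups = {
    size: (serial[size] / cm5_16[size], serial[size] / cm5_32[size])
    for size in serial
}

if __name__ == "__main__":
    for size, (s16, s32) in speedups.items():
        print(f"{size}: {s16:.0f}x on 16 processors, {s32:.0f}x on 32 processors")
```

The ratios are roughly 65 and 59 for 16 processors and 109 and 102 for 32 processors, consistent with the rounded figures of about 60 and 100 given in the text.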

Far field patterns on two planes normal to each other were also calculated (as in [9] and [10]). In our calculations, the angle increment for far field points is three degrees, and the integration surface S' for the far zone transformation is 2 cells distant from the objects being studied, in all directions. Compared with the computer times in [9] and [10] for far field pattern calculations, the parallel code is approximately 16 times faster for 16 processors and approximately 27 times faster for 32 processors. Note that this time comparison may be misleading because: 1) different ABCs are used, 2) some parameters, such as the angle increment for far field points, are not stated in the references, 3) different sources are used and 4) the method used to obtain the far field patterns is not explicitly described in the references. Even so, the time comparison is offered as an indicator of code performance.

The parallel code was also used to obtain the radiation pattern of a λ/2 dipole. Table 2 shows the CM busy time for different parts of the program for these calculations. The integration surface S' for the far zone transformation is 8 cells inside the FDTD space boundary in all directions. Radiation patterns are calculated on two normal planes, and the angle increment for far field points is three degrees. The FDTD space size has been carefully chosen to maximise efficiency.

By comparing the results in this table and the results in [9] and [10] it can be seen that the percentages of the time used by different parts of the program in the serial case and in the parallel case are significantly different. While the FDTD algorithm takes 57% and 81% of the CPU time in [9] and [10] respectively, it takes only about 16% of the computer time for the parallel code. The ABCs, which take only 3.8% and 5.5% of the CPU time in [9] and [10] respectively (note: Mur's ABCs are not used in [9,10]), take about 30-35% of the computer time in the parallel code.


Table 1. Comparison of CPU time on a SUN SPARC-2 and CM busy time on the CM-5 for the FDTD algorithm, for a monopole on a conducting box in the presence of a head and hand model (1500 time steps).

FDTD space size    SUN SPARC-2     CM-5, 16 processors (64 VUs)    CM-5, 32 processors (128 VUs)
55x46x56           3059 s [7]      47 s                            28 s
96x86x102          11478 s [8]     194 s                           113 s

Table 2. CM busy time for different parts of the program for a λ/2 dipole radiation pattern calculation, f = 1.875 GHz, cell size = 5 mm³, time steps = 1000.

CM-5 configuration     16 processors (64 VUs)       32 processors (128 VUs)
FDTD space size        64x80x48     64x144x48       64x80x80     64x144x80
Total CM busy time     237 s        408 s           209 s        345 s
FDTD algorithm         15.49 %      15.33 %         15.68 %      16.60 %
Mur's ABCs             34.71 %      31.06 %         34.81 %      29.50 %
Equivalent currents    9.35 %       15.54 %         10.81 %      18.03 %
Near to far field      37.45 %      34.99 %         35.77 %      32.70 %
Others                 2.99 %       3.08 %          2.92 %       3.17 %
Total                  100 %        100 %           100 %        100 %

Conclusion 

In this paper, methods of parallelizing the FDTD algorithm, Mur's ABCs, and the near zone to far zone transformation are discussed. On a 32 processor CM-5, the core FDTD algorithm is about 100 times faster than an existing serial code run on a SUN SPARC-2 workstation, and the calculation of the radiation patterns on two planes normal to each other is approximately 27 times faster. The code is transportable in that it can be run unmodified on other CM-5 machines irrespective of the number of processors.


Calculation of Electromagnetic Fields with the
Multiple  Multipole  Method  (MMP  Method)  on  Parallel  Computers 

C.  Tudziers,  H.  Singer 
Technical  University  Hamburg-Harburg 


Abstract 

The MMP Method extended by line sources is well suited for the analysis of electromagnetic field problems in electrodynamics. To handle large-scale problems, an existing MMP program package was modified for implementation on a parallel computer (MIMD system with distributed memory). The program package has been divided into three parts which have been parallelized individually. Tests have been performed on several systems, including a machine containing 128 processors, with satisfactory results.

Introduction 

The  MMP  Method  is  used  for  the  numerical  computation  of  3D  electromagnetic  fields.  Large  scale 
problems  involve  long  computation  times  and  large  equation  systems.  Parallel  computers  are  designed 
to  handle  such  problems.  As  shown  later,  the  MMP  code  can  be  divided  into  three  nearly  independent 
parts.  An  effective  parallelization  requires  parallel  algorithms  for  all  of  them.  We  developed  such 
algorithms  in  order  to  implement  them  in  an  existing  MMP  code  which  has  already  been  successfully 
run  on  sequential  computer  systems.  In  this  paper  we  present  these  algorithms  and  show  the  results  of 
some  tests  performed  on  different  parallel  systems. 


The  MMP  Method 

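The MMP Method is based on the Helmholtz equations for time harmonic fields,

$$\Delta \mathbf{H} + k^2 \mathbf{H} = 0, \qquad \Delta \mathbf{E} + k^2 \mathbf{E} = 0 \qquad \text{with } k = \omega\sqrt{\mu\varepsilon}.$$

They can be solved by Debye potentials and separation in spherical coordinates. The results are series which include spherical Bessel functions $j_\ell$, Hankel functions $h_\ell^{(2)}$ and Legendre functions $Y_{\ell m}$:

$$\mathbf{E} = \mathbf{r} \times \operatorname{grad} v + \frac{1}{j\omega\varepsilon}\operatorname{rot}(\mathbf{r} \times \operatorname{grad} u),$$

$$\mathbf{H} = -\frac{1}{j\omega\mu}\operatorname{rot}(\mathbf{r} \times \operatorname{grad} v) + \mathbf{r} \times \operatorname{grad} u,$$

$$u(r,\theta,\psi) = \sum_{\ell=1}^{\infty}\sum_{m=-\ell}^{\ell}\left[a_{\ell m}\,h_\ell^{(2)}(kr) + b_{\ell m}\,j_\ell(kr)\right] Y_{\ell m}(\theta,\psi),$$

$$v(r,\theta,\psi) = \sum_{\ell=1}^{\infty}\sum_{m=-\ell}^{\ell}\left[c_{\ell m}\,h_\ell^{(2)}(kr) + d_{\ell m}\,j_\ell(kr)\right] Y_{\ell m}(\theta,\psi).$$

For $b_{\ell m} = d_{\ell m} = 0$ these equations represent the fields of radiating dipoles ($\ell = 1$), quadrupoles ($\ell = 2$) and higher-order multipoles. In numerical calculations the electromagnetic field is divided into a known incident part $(\mathbf{E}^{(i)}, \mathbf{H}^{(i)})$ and the scattered field. The latter is approximated by several expansions of the described form.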

An important extension of the MMP method consists in joining a large number of aligned multipole expansions to one segment. A source density is defined which is modulated by certain basis functions along the segment length. The electromagnetic field of such segments is calculated by numerical integration. This procedure can reduce the total number of unknown coefficients and thereby save storage. This technique is fully described in [1] for straight lines. It can also be applied to rings or arbitrary curves [2].

To determine the unknown coefficients, the boundary conditions of the electromagnetic field at surfaces have to be considered (Fig. 1).


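Fig. 1: Boundary conditions

$$\mathbf{t}\cdot\left(\mathbf{E}^{(II)} - \mathbf{E}^{(I)}\right) = 0, \qquad \mathbf{n}\cdot\left(\mu^{(II)}\mathbf{H}^{(II)} - \mu^{(I)}\mathbf{H}^{(I)}\right) = 0$$

For $\kappa_{II} \to \infty$:

$$\mathbf{t}\cdot\mathbf{E}^{(I)} = 0, \qquad \mathbf{n}\cdot\mathbf{H}^{(I)} = 0.$$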

where  -  j - ) . 


CO- Go 


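Usually, these equations are evaluated successively at a number of surface points in order to obtain a system of linearly independent equations for the unknown sources in the following form (rows correspond to surface points, columns to sources):

$$\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} q_1 \\ \vdots \\ q_n \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix} \iff A\,q = b$$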

The elements of the matrix A hold the frequency-dependent geometrical and material data of the configuration, while the vector b contains the information on the known incident field; q is the vector of the unknown sources. In the case of m = n (point matching) this system has a unique solution. But the solution vector obtained in this way often yields large deviations from the boundary conditions outside the matching points. Better results are achieved if the number of surface points is increased (m > 2n) and the overdetermined equation system is evaluated [3]. The sought solution must minimise the squared norm of the residual r:

$$r^H r \rightarrow \text{minimum} \qquad \text{with } r = A\,q - b.$$

Therefore the following equation system must be solved [4]:

$$A^H A\,q - A^H b = 0 \iff Z\,q = y$$

where $A^H$ is the Hermitian transpose of A. This so-called Least Squares Method (LSM) requires the matrix product $A^H A$ to be formed before the square matrix Z can be decomposed in order to determine the solution vector q. Usually, the QR decomposition (Householder) is applied to the overdetermined equation system in such problems. The first way, however, allows the required amount of memory to be reduced, as shall be shown later.
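The normal-equation route described above can be sketched in a few lines. The following Python example is a serial illustration under stated assumptions, not the MMP package's solver: the function names are hypothetical, the matrices are plain nested lists, and Z q = y is solved by Gaussian elimination with partial pivoting simply because it is short to write down.

```python
def hermitian_transpose(a):
    """Return the conjugate transpose of a matrix given as a list of rows."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def solve(z, y):
    """Solve the square system Z q = y by Gaussian elimination with pivoting."""
    n = len(y)
    aug = [[complex(v) for v in row] + [complex(rhs)] for row, rhs in zip(z, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    q = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * q[c] for c in range(r + 1, n))
        q[r] = (aug[r][n] - s) / aug[r][r]
    return q


def least_squares(a, b):
    """Form Z = A^H A and y = A^H b, then solve Z q = y."""
    ah = hermitian_transpose(a)
    m, n = len(a), len(a[0])
    z = [[sum(ah[i][k] * a[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    y = [sum(ah[i][k] * b[k] for k in range(m)) for i in range(n)]
    return solve(z, y)
```

For example, `least_squares([[1j, 0], [0, 2], [1, 1]], [1j, 2j, 1 + 1j])` recovers q = (1, j). Note that Z has only n x n entries, whereas A has m x n with m > 2n, which is the memory saving referred to above.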

After  this  stage,  the  electromagnetic  field  can  be  computed  at  arbitrary  space  points.  For  the  analysis  of 
the  result,  a  graphic  output  is  useful.  In  general,  this  implies  the  computation  of  the  field  at  a  large 
number  of  space  points. 

Consequently,  the  MMP  code  can  be  summed  up  by  the  three  most  time-consuming  parts: 

•  Computation  of  the  matrix  elements  in  order  to  compose  an  equation  system 

•  Solution  of  the  equation  system 

•  Field  computations 


The  Parallel  Computer 


All parallel computers used in our investigations are based on the MIMD type (MIMD = Multiple Instruction stream, Multiple Data stream, Fig. 2). This means that every processor has its own local memory. The data transfer must be performed by communication (message passing). The development of the parallel algorithms has been made on a transputer system containing 128 transputers (Parsytec SC 128) with 4 MByte of local memory. Each of them has four links for communication with a transfer rate of 1 MByte/s. The transputers can be connected via the links to different types of networks (e.g. rings, trees or lattices). The second parallel system mentioned in this paper is a cluster of six HP735 workstations connected via FDDI (100 MBit/s) under PVM.


Fig. 2: Model of a MIMD system

Parallelization

In order to achieve an effective usage of the parallel systems, the three parts of the MMP code mentioned above have to be parallelized. The aim is to solve large-scale problems in acceptable computation times. Such problems with several thousand unknowns lead to a large requirement of storage for the matrix. Therefore the equation system has to be distributed among the local memories of the processors. The largest computable problem size is thus limited by the sum of the local memories. The parallel algorithms for the computation of the matrix elements and for the solution of the equation system have to deal with this kind of distribution.


Several examples were used to test the effectiveness of the parallel algorithms. One of them is a series of perfectly conducting bars in a plane-wave field, depicted in Fig. 3. This configuration was used purely to examine the efficiency of the parallel algorithm. The tests were performed on the transputer system. Since each transputer had 4 MByte of memory, the maximum number of unknowns was around 500 in these first examples.
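
As a rough plausibility check of this figure (an estimate added here, not a measurement from the tests): assume that the whole dense n × n matrix must fit into one 4 MByte local memory, as it would for a single-processor reference run, and that each entry is a double-precision complex number of 16 bytes. Both points are assumptions about the storage scheme. Then

$$ 16\,n^2 \le 4 \cdot 2^{20} \quad\Longrightarrow\quad n \le \sqrt{2^{18}} = 512 , $$

which is consistent with a limit of around 500 unknowns once some memory is reserved for the program itself.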


Fig. 3: Test configuration (point multipoles)

Parallelization of the matrix computation

First, the algorithm for point matching is considered. In this case it makes no difference in principle whether the rows or the columns of the matrix are distributed among the local memories. However, the structure of the existing sequential program makes it easier to choose the row-oriented technique. On every processor a computation process is started which is able to compute one row of the matrix (Fig. 4).


The first processor holds an additional distribution process which is responsible for the connection to the periphery and for load balancing. Initially, this process sends the geometry, material constants, frequency and topology data to each processor. Subsequently, it combines a certain number of surface points into packages and sends them successively to the computation processes. These processes calculate the corresponding rows of the equation system and store them in their local memories. Afterwards, each process can order a new data package by sending a signal to the distribution process. At the end, after all surface points have been distributed, each processor holds a certain number of rows of the equation system in its local memory. In general, this number varies, because the computation times of the matrix elements differ. To achieve equal loads for all processors in the subsequent matrix decomposition, the processors with too many rows have to send the excess rows to those with fewer rows. The graph in Fig. 4 shows the behaviour of this algorithm for several processor numbers p, using two simple test examples with 102 and 498 unknowns. The efficiency e is defined as

$$ e = \frac{T_s}{p \, T_p} \cdot 100\,\% $$

$T_s$ and $T_p$ are the computation times of the sequential and parallel program, respectively. The curve decreases at higher processor numbers, because the central distribution process is not able to provide the data packets fast enough to keep the processors permanently busy, especially at lower loads. This can be avoided by increasing the package size or by adopting an alternative load balancing strategy discussed later.
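
The efficiency definition can be evaluated with a few lines of Python. This is only an illustrative sketch: the function name `efficiency` and the timings are hypothetical and are not taken from the measurements in Fig. 4.

```python
def efficiency(t_seq, t_par, p):
    """Parallel efficiency in percent: e = T_s / (p * T_p) * 100."""
    if p < 1 or t_par <= 0:
        raise ValueError("p must be >= 1 and t_par must be positive")
    return t_seq / (p * t_par) * 100.0


# Hypothetical timings in seconds, chosen only to illustrate the formula.
print(efficiency(t_seq=120.0, t_par=10.0, p=16))  # 75.0
```

An efficiency of 100 % would mean that p processors finish exactly p times faster than the sequential program.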


The parallelization of the least squares technique is more complicated, due to the matrix product:

$$ Z = A^{*} A = \begin{pmatrix} z_{11} & \cdots & z_{1n} \\ \vdots & \ddots & \vdots \\ z_{n1} & \cdots & z_{nn} \end{pmatrix} $$

The analysis of the matrix elements $z_{ij}$ shows that they can be written as a sum of products, each containing only elements of one row of A. Thus, it is not necessary to store the whole matrix A. After one or more rows have been computed and stored in a buffer, the corresponding products can be calculated and added to the already known parts of the elements $z_{ij}$. After that, these rows are no longer needed and the buffer can be overwritten.
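
The row-by-row accumulation can be sketched as follows. The sketch assumes that the product is the conjugate-transpose product, so that each row a of A contributes conj(a_i) · a_j to the element z_ij; the function names and the small test matrix are hypothetical and serve only as illustration.

```python
def accumulate_row(z, row):
    """Add the contribution of one row of A to Z (in place)."""
    n = len(row)
    for i in range(n):
        ci = row[i].conjugate()
        for j in range(n):
            z[i][j] += ci * row[j]


def normal_matrix_streaming(rows, n):
    """Build Z from an iterable of rows; only one row is held at a time."""
    z = [[0j] * n for _ in range(n)]
    for row in rows:
        accumulate_row(z, row)  # afterwards the row buffer may be overwritten
    return z


def normal_matrix_direct(a):
    """Reference: z_ij = sum_k conj(a_ki) * a_kj using the full matrix A."""
    n = len(a[0])
    return [[sum(a[k][i].conjugate() * a[k][j] for k in range(len(a)))
             for j in range(n)] for i in range(n)]


# Small hypothetical 3 x 2 matrix A.
a = [[1 + 1j, 2 + 0j], [0 + 1j, 1 - 1j], [3 + 0j, 0 + 2j]]
z_stream = normal_matrix_streaming(iter(a), n=2)
z_direct = normal_matrix_direct(a)
```

Both routes give the same Z, but the streaming version never needs more than one row of A in memory.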


The data structure consuming most of the storage is the matrix Z. Therefore Z has to be distributed among the local memories of the processors. On the other hand, the rows of A are computed by the different processors. Since each row of A contributes a portion to each element of Z, these portions must be delivered to the corresponding processors via the communication system. Fig. 5 shows the parallel algorithm schematically. The computation processes receive packets of surface points and compute the corresponding rows of A. Afterwards, they calculate the products of the row elements and send them to the processors that hold the respective elements of Z. Receiving processes on those processors add the arriving portions to the already collected parts of the elements $z_{ij}$. Of course, the portions are not sent element by element; they are first collected in a buffer, which is then sent as a whole to the corresponding processor.


Fig. 5 also shows the efficiency obtained with the described algorithm for several test configurations. Both curves for the configuration type used above lie below the curves of the point matching algorithm because of the additional communication. The third curve belongs to another test configuration where segment sources have been used, which entail a numerical integration for the corresponding matrix elements. As a result, the computation of the matrix elements takes considerably more time than the matrix multiplication. The ratio of communication to computation time therefore improves and the efficiency rises.


Fig.  5:  Process  model  and  efficiency 


Parallelization  of  the  matrix  decomposition 

After the equation system has been established, it remains in the local memories of the processors so that the solution process can start immediately. In the case of the point-matching technique, Gaussian elimination with partial pivoting is employed. The matrix obtained by the LSM is symmetric, so Cholesky decomposition can be applied. Many algorithms for the parallel solution of equation systems, covering different methods and distribution schemes, have been suggested in the literature. We therefore refer here to the work of other authors [5,6].

Parallelization  of  the  field  computation 

Essentially, this last part of the MMP code can easily be parallelized by means of the so-called farming method. As in the point matching algorithm explained in the first subsection, a central process (master) sends out data packets containing space points to computation processes (workers) placed on each processor. After a worker process has determined the results, it sends them back to the central process. The master must receive the results, do some further work (e.g. storing to memory or disk, analysing) and send out a new data package to the waiting worker.
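
The structure of the farming method can be sketched with Python threads and queues. This is a minimal illustration of the master/worker pattern only, not the transputer implementation: the function `field_at` is a hypothetical stand-in for the field evaluation, and a shared task queue plays the role of the worker's request for a new package. Python threads show the control flow but give no real speed-up for such a computation.

```python
import queue
import threading


def field_at(point):
    """Stand-in for the field evaluation at one space point (hypothetical)."""
    x, y = point
    return x * x + y * y


def worker(tasks, results):
    """Worker: fetch a packet, evaluate all its points, return the results."""
    while True:
        packet = tasks.get()
        if packet is None:  # sentinel: no more work
            break
        index, points = packet
        results.put((index, [field_at(p) for p in points]))


def farm(points, n_workers=4, packet_size=8):
    """Master: split the points into packets and collect results in order."""
    tasks, results = queue.Queue(), queue.Queue()
    packets = [points[i:i + packet_size]
               for i in range(0, len(points), packet_size)]
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for index, packet in enumerate(packets):
        tasks.put((index, packet))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    collected = [None] * len(packets)
    for _ in packets:
        index, values = results.get()
        collected[index] = values
    for t in threads:
        t.join()
    return [v for values in collected for v in values]


points = [(i, j) for i in range(10) for j in range(10)]
values = farm(points)
```

The parameter `packet_size` controls how many space points travel in one message, so a larger value means fewer communications with the master.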

One disadvantage of this method often appears on massively parallel systems or with short computation times. Since every worker process sends its results to the central master, the master may not always be able to collect the results rapidly enough. Consequently, the dispatch of new data packets to the idle workers may be interrupted. Due to this idle time, the efficiency of the parallel program decreases rapidly at higher processor numbers.

This behaviour can be improved by several approaches. One of them consists in sending out large packages of tasks in order to reduce the number of communications. If possible, the results should be stored in the local memories of the processors until the task distribution is complete and then be sent to the master all together. Since the matrix of the equation system is no longer needed, almost the entire memory is available. In this way, the same behaviour as in the parallel point matching can be observed.

Another improvement, often applied successfully, is to store a certain number of tasks in the local memories. After the result of the current task has been sent to the master, the worker process can begin to compute the next task immediately. New jobs received from the master are appended to the queue.

Additional tests

In  order  to  test  the  parallel  program  in  the  case  of  higher  computation  amounts  and  higher  processor 
numbers,  two  further  examples  are  considered.  The  first  one  is  the  perfectly  conducting  object  shown 
in  Fig.  6a.  The  scattered  field  is  approximated  by  five  multipole  expansions  with  ^  =  8  inside  the  body, 
which  results  in  960  unknown  sources.  The  short  tube  in  Fig.  6b  was  computed  exclusively  with  ring 


d)  Computation  times  of  the  whole  program 


The computation times, relative to the time obtained with eight processors, have been plotted in Fig. 6d. As expected, the distance to the ideal curve grows with increasing processor number, but no saturation can be observed. Thus, even with large processor numbers the parallel code still yields a worthwhile reduction in computation time. The difference between the two examples is again explained by the different kinds of source used in the two configurations.


As mentioned earlier, the parallel MMP code has also been run on a workstation cluster. In our experience, the computing power of each workstation corresponds to that of 32 transputers. Since these machines were permanently in use by other people, exact measurements were impossible. A qualitative analysis showed results similar to those of the transputer system.




Conclusions 


The previous sections demonstrate the application of the MMP method on parallel computers. The parallel algorithms developed for the three parts of the MMP code fulfil the following criteria:

•  Scalability: they can be run on an arbitrary number of processors.

•  They are effective at crucial problem sizes.

•  They are applicable to different types of parallel architectures based on the MIMD model.

Implementation of the finite-difference time-domain method on parallel computers.

R.  S.  David  and  L.  T.  Wille,  Department  of  Physics,  Florida  Atlantic  University,  Boca  Raton,  FL  33431, 
USA. 

Abstract: We discuss the application of massively parallel computers to the Finite-Difference Time-Domain (FDTD) method in electromagnetics. With regard to this technique we compare and contrast machines based on the Single Instruction Multiple Data (SIMD) paradigm with those employing a Multiple Instruction Multiple Data (MIMD) approach, notably distributed systems. Although these methodologies are quite different, they both yield excellent performance and demonstrate the applicability of parallel processing to FDTD methods. A specific application, the FDTD solution of the scattering of a plane wave off a dielectric sphere, is implemented on the MasPar family of SIMD computers. Parallel and sequential implementations of this problem are compared. While sequential programs show a near-linear increase in computation time as the problem size increases, parallel programs exhibit discontinuous plateau-like jumps. Scaling studies were carried out by increasing the problem size while increasing the number of processors; they demonstrate the method’s excellent scalability properties.

1.  Introduction. 

Computational electromagnetic problems increasingly require calculations that are very computer intensive. Large amounts of computer time may be needed to reach steady state or to provide adequate spatial and temporal resolution. Parallel computation is now being considered by researchers in this and many other areas to obtain results more quickly. The increased speed combined with the enhanced memory capacity of parallel machines provides higher accuracy and better resolution than would be possible on a sequential machine for the same amount of computer time. Three main approaches exist for the computational solution of electromagnetic problems. They are the Method of Moments (MOM) [1,2], the Finite-Difference Time-Domain (FDTD) method [1,3,4], and the Finite Element Method (FEM) [1,3]. The MOM is based on the integral formulation of Maxwell’s equations with appropriate boundary conditions. In contrast, the FDTD method and the FEM are formulated from the differential form of Maxwell’s equations. These approaches have also been combined, resulting in hybrid formulations [1,3]. Parallel implementations of all of these methods have been developed for selected problem instances [5-10]. Recently Varadarajan and Mittra [10] implemented a parallel version of the FDTD method on a distributed system. These authors used a cluster of workstations with the Parallel Virtual Machine (PVM) networking software to form a multiple-instruction multiple-data (MIMD) environment. In contrast, the present authors [9] implemented the FDTD method on a two-dimensional mesh computer (MasPar MP-1 and MP-2) which operates in a single-instruction multiple-data (SIMD) setting. Although these two approaches are quite dissimilar, both yield very efficient performance and, in different ways, demonstrate the computational improvements available from parallel processing.

The FDTD method is particularly well suited for massively parallel implementation because it has a regular grid-based structure with field updates occurring simultaneously at every grid point. To produce a parallel formulation the field space may be divided into disjoint regions (geometrical decomposition), each of which is assigned to a dedicated processor called a processing element (PE). As time proceeds, each PE updates its field variables using information from neighboring PE’s according to the FDTD equations. Field coordinates are updated simultaneously, i.e. in parallel, in contrast to a loop over the field dimension as would be used on a sequential computer. Excluding the boundary region, only nearest-neighbor information is used for field updates, which implies only short-range communications between PE’s, a requirement that parallel machines are characteristically designed for. Parallel machines typically work optimally when communication between distant PE’s is kept to a minimum, as is the case for the nearest-neighbor connections used in the FDTD method. For an absorbing boundary as described by Taflove [11], the field coordinates along the boundary require information from nearest and next-nearest neighbors, resulting in a slight, but acceptable, increase in computation time. Of more critical importance, the boundary field points must be updated separately from the interior field points. Thus the PE’s assigned to the interior grid points remain idle while the boundary is being updated. As will be discussed below, this can be circumvented by redistributing the boundary calculations to other PE’s, a form of load balancing. However, the overhead associated with this redistribution can offset any gains made.
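
The idea of geometrical decomposition with nearest-neighbor exchange can be illustrated with a deliberately simplified one-dimensional sketch. It is not the three-dimensional code discussed in this paper: the update rule, the coefficient `c`, and all function names are assumptions made for illustration. Each region only needs one value from its right neighbor before the H update and one value from its left neighbor before the E update.

```python
def step_global(e, h, c=0.5):
    """One time step on the undivided 1D grid (sequential reference)."""
    n = len(e)
    for i in range(n - 1):
        h[i] += c * (e[i + 1] - e[i])
    for i in range(1, n):
        e[i] += c * (h[i] - h[i - 1])


def step_decomposed(e_parts, h_parts, c=0.5):
    """The same time step with the grid split into disjoint regions."""
    p = len(e_parts)
    # Exchange 1: every region receives the first E value of its right neighbor.
    right_e = [e_parts[r + 1][0] if r + 1 < p else None for r in range(p)]
    for r in range(p):
        e, h = e_parts[r], h_parts[r]
        for i in range(len(e)):
            nxt = e[i + 1] if i + 1 < len(e) else right_e[r]
            if nxt is not None:  # the last global H value is not updated
                h[i] += c * (nxt - e[i])
    # Exchange 2: every region receives the last H value of its left neighbor.
    left_h = [h_parts[r - 1][-1] if r > 0 else None for r in range(p)]
    for r in range(p):
        e, h = e_parts[r], h_parts[r]
        for i in range(len(e)):
            prev = h[i - 1] if i > 0 else left_h[r]
            if prev is not None:  # the first global E value is not updated
                e[i] += c * (h[i] - prev)


n, p = 24, 4
e_ref = [0.0] * n
e_ref[n // 2] = 1.0  # initial pulse in the middle of the grid
h_ref = [0.0] * n
size = n // p
e_parts = [e_ref[r * size:(r + 1) * size] for r in range(p)]
h_parts = [h_ref[r * size:(r + 1) * size] for r in range(p)]

for _ in range(10):
    step_global(e_ref, h_ref)
    step_decomposed(e_parts, h_parts)

e_joined = [v for part in e_parts for v in part]
h_joined = [v for part in h_parts for v in part]
```

In a real parallel program each region would live on its own PE and the two exchanges would be the only communication per time step.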

For the PVM distributed system implementation advocated by Varadarajan and Mittra [10], the problem domain was divided into N regions of equal size by partitioning the physical space along the z-direction. These regions were mapped to the processors, which were configured in a linear array connected through an Ethernet LAN segment. Thus the computational task was divided up into N subtasks, with care taken to minimize idle time and interprocessor communication. Here N was taken to be relatively small (N = 4-8). The system operates in a MIMD fashion because different instructions may be executed simultaneously on different PE’s and synchronization is not specifically enforced at each clock cycle. Each PE has its own instruction stream that operates on its own data. A MIMD approach works well for coarse-grained problems and is able to handle a higher degree of irregularity than SIMD processing. Thus it is well suited for problems that generally require greater processing flexibility. However, this flexibility of MIMD machines can also be a hindrance due to synchronization issues. Typically synchronization must be forced by software directives, which for systems with very many PE’s can become quite a formidable task. Also, the different subtasks assigned to the various PE’s imply that different PE’s have different computational burdens at a particular time. To minimize the time that processors spend waiting for other processors to finish their tasks (idle time), great care must be taken that the problem formulation is properly load balanced. Data management for MIMD computers uses either private memory associated with each PE, with all communication and synchronization done through message passing (multicomputers), or a shared memory space (multiprocessors). Clusters of workstations (a ‘farm’) can also be organized into a MIMD system using networking software such as PVM or Linda [12].

An alternative parallel approach for solving the FDTD problem works on a much finer scale and assigns a PE to each point in the discretized field, using a massively parallel computer containing several thousand PE’s. Being of SIMD architecture, such massively parallel computers utilize a control unit that broadcasts the same instruction to all PE’s. Different data specific to each PE, and therefore to each field point, are operated on by the same instruction simultaneously. Typically SIMD machines are used for problems containing fine-grain parallelism where the problem space is for the most part regular. By their very design SIMD machines operate synchronously, allowing them to avoid synchronization problems, but at the cost of losing flexibility. It is very important to formulate a SIMD-based problem such that the data is mapped in a regular fashion to take advantage of synchronization. In principle it is possible for a shared global memory to be accessed by PE’s in a SIMD machine. This is impractical to build, however, so that what is typically available are machines with private memory associated with each PE. Data is exchanged between PE’s via an interconnection network. The most common networks used are simple two- and three-dimensional meshes as well as hypercubes. Using embedding theory [13, 14] a parallel algorithm implemented on one type of network may be ported to another network, although there may be a loss of efficiency associated with such a mapping.

This paper focuses on the application of massively parallel computers to the FDTD method, with particular emphasis on identifying and utilizing the parallel structure of a problem to maximize computational efficiency. As an example, a computer code that uses the FDTD method to model the scattering of a plane wave off a dielectric sphere is parallelized. Data mapping strategies are investigated to conform the FDTD method to the two-dimensional mesh massively parallel computer used. The computational results are compared with those from a sequential machine to provide a measure of the usefulness of parallel computation in the FDTD method.

2.  Parallel  Implementation. 

To exploit a problem’s parallelism its structure must be carefully investigated to determine the extent of its natural parallelism. This is not necessarily obvious and quite often is hidden, particularly when the starting point is an algorithm written for a sequential computer. There is the notion of trivial parallelization, which involves dividing a problem into separate parts that do not need to know anything about the other parts. The separate parts are solved simultaneously yet independently. An electromagnetic example of this would be the calculation of results over a range of frequencies, angles of incidence, polarizations, etc. A different PE is assigned to each computational task without any of the PE’s needing to communicate information to any other PE. This approach works well for coarse-grained problems where the number of PE’s is of the same order as the number of tasks and where each PE has enough memory to carry out its job. Many problems, however, do not fall in this category and have a different grade of parallelism which involves communication between different subtasks. The modeling of an electromagnetic wave propagating through a region is an example of this. Here the problem space may be divided up into disjoint regions, with each region being assigned to a PE. As time progresses, each part of space assigned to a specific PE is updated in relation to all other regions. Synchronous communication between PE’s is clearly necessary at this point. What the communication pathways actually are determines the extent of the natural parallelism. If these pathways propagate throughout the problem space in a replicated fashion, then the problem generally has a high degree of parallel structure and excellent speed-up will be possible on a parallel computer. If on the other hand no recurring communication pathways are present, then the problem possesses little parallel structure and will most likely not benefit from parallel computation. Most problems typically lie somewhere in between these extremes, with portions of the problem space exhibiting a repeated communication pathway network while other parts have no connectivity at all. Additionally, the distance between nodes in the algorithm graph plays an important role in determining how fast the parallel computations are carried out. For instance, a problem may have long-range, highly regular connectivities, but this cannot be exploited if the computer on which the problem is being implemented does not have an efficient long-range communication capability.
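
Trivial parallelization, as in a frequency sweep, can be sketched in a few lines of Python. The function `response` is a hypothetical stand-in for one complete and independent simulation run, and the thread pool merely plays the role of a set of PE’s that never exchange data; it is not meant to model any particular machine.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def response(frequency_hz):
    """Stand-in for one complete, independent simulation run (hypothetical)."""
    return math.sin(2.0 * math.pi * frequency_hz * 1e-9)


frequencies = [1e8 * k for k in range(1, 9)]

# Each task is handed to a worker; the tasks never communicate with each other.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(response, frequencies))

sequential = [response(f) for f in frequencies]
```

Because the tasks are independent, the parallel and sequential sweeps return identical results, and the only design question is how many tasks to give each worker.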

An inspection of the underlying equations shows that there is considerable natural parallelism built into the structure of the FDTD method. To demonstrate this, the specific example of a plane wave incident on a lossless dielectric sphere is considered, although the discussion is by no means limited to this case. The FDTD method can be employed to determine the scattered fields, and the results may be compared to the exact solution to this problem as provided by Mie [1]. Owing to its regular geometric form this problem provides clear insight into how the FDTD method is mapped into a parallel regime. A sequential numerical FDTD solution to this problem is provided by Sadiku [1]. To approximate the spatial and temporal derivatives, Yee’s second-order central-difference approximations were used [1, 4, 15]. The electromagnetic fields in the interior region are given by six FDTD equations that are all quite similar. For example, using standard notation [1], the electric field component in the x direction at a time n and a field point (i, j, k) is given by:

$$ E_x^{n}(i,j,k) = E_x^{n-1}(i,j,k) + H_z^{n}(i,j,k) - H_z^{n}(i,j-1,k) + H_y^{n}(i,j,k-1) - H_y^{n}(i,j,k). \qquad (1) $$

Note that only terms from nearest-neighbor grid points are used and that only temporal information from the same or previous time step is needed. The other equations at interior grid points are very similar. To terminate the physical domain, absorbing boundary conditions as developed by Taflove et al. [11] are typically used. These come in several types. For example, the electric field in the z-direction at time n and at boundary points (i,0,k) is given by:

$$ E_z^{n}(i,0,k) = E_z^{n-2}(i,1,k). \qquad (2) $$

Thus, this boundary condition (and others similar to it) only involves communication of nearest-neighbor magnitude, although it does invoke temporal information two time steps away. Another type of boundary condition involves two or three components and contains fields at points two lattice jumps away. An example of such a condition gives the z-component of the magnetic field on the x = 0 boundary and takes the form:

$$ H_z^{n}(0,j,k) = \frac{1}{3}\left[ H_z^{n-2}(1,j,k-1) + H_z^{n-2}(1,j,k) + H_z^{n-2}(1,j,k+1) \right]. \qquad (3) $$

Again this and similar boundary conditions involve temporal information two steps away in addition to field variables residing on PE’s that are next-nearest neighbors. Thus, just like the interior grid points, the boundary conditions all involve short-range communications. However, the boundary must be computed separately from the interior, an intrinsically serial operation that could impose a limitation on the parallel nature of the FDTD. While on a sequential computer the time spent on boundary sites is negligible compared to that consumed in calculating fields at interior grid points, this is no longer true on a parallel machine, and the computation of the boundary information may well become the time-determining factor in the calculations. Thus, special care needs to be taken to ensure that this part of the program is performed efficiently.
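
The separation between interior and boundary updates can be made concrete with a small sequential sketch of Eqs. (1) and (2). It is only an illustration under stated assumptions: the fields are stored as nested Python lists, all coefficients are normalized to one as in the equations above, and the loop limits (skipping j = 0 and k = 0 in the interior update) as well as the function names are choices made here, not details taken from the code of [1].

```python
def zeros(nx, ny, nz):
    """Create an nx x ny x nz field array filled with zeros."""
    return [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]


def update_ex_interior(ex, hy, hz):
    """Eq. (1): Ex += Hz(i,j,k) - Hz(i,j-1,k) + Hy(i,j,k-1) - Hy(i,j,k)."""
    nx, ny, nz = len(ex), len(ex[0]), len(ex[0][0])
    for i in range(nx):
        for j in range(1, ny):
            for k in range(1, nz):
                ex[i][j][k] += (hz[i][j][k] - hz[i][j - 1][k]
                                + hy[i][j][k - 1] - hy[i][j][k])


def update_ez_boundary(ez_now, ez_two_steps_ago):
    """Eq. (2): Ez at time n on the j = 0 face from Ez at time n-2 at j = 1."""
    for i in range(len(ez_now)):
        for k in range(len(ez_now[0][0])):
            ez_now[i][0][k] = ez_two_steps_ago[i][1][k]


nx = ny = nz = 3
ex, hy, hz = zeros(nx, ny, nz), zeros(nx, ny, nz), zeros(nx, ny, nz)
hz[1][1][1], hz[1][0][1] = 2.0, 0.5
hy[1][1][0], hy[1][1][1] = 1.0, 0.25
update_ex_interior(ex, hy, hz)  # first pass: interior points only

ez_now, ez_old = zeros(nx, ny, nz), zeros(nx, ny, nz)
ez_old[2][1][0] = 7.0
update_ez_boundary(ez_now, ez_old)  # second pass: one boundary face only
```

The interior pass touches almost every grid point, whereas the boundary pass touches a single face. On a machine with one PE per grid column, the second pass therefore keeps only a small fraction of the PE’s busy.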

Starting from a sequential FDTD implementation as given by Sadiku [1], a parallelized program was developed in two parallel computer languages, MPF and MPL, both of which run on the MasPar class of massively parallel computers [12, 16]. MPF and MPL are parallel extensions of the Fortran 90 and C languages respectively. The MasPar MP-1 (DECmpp 12000) parallel computer utilizes 1,024 to 16,384 PE’s in a two-dimensional SIMD mesh architecture with toroidal wrap-around at the boundaries. Short-range PE communication of nearest-neighbor magnitude is provided by a fast X-net network, while for long-range PE communication a somewhat slower global router is used. Each PE is equipped with forty 32-bit registers and has 16 or 64 kByte of private RAM. An array control unit (ACU) is used to control the PE array, i.e. it fetches, decodes, and broadcasts the instructions among the PE’s. All parallel operations are run on the data parallel unit (DPU), which is collectively made up of the communication network, the PE array, and the ACU. Serial computations, data I/O, and the user interface are run on a UNIX front-end workstation (DECstation 5000). MasPar also makes the MP-2, a computer that is faster than the MP-1 but still binary compatible with it. All computations reported in this paper were performed on a 4,096-node (64x64 PE-array) MP-1 with 16 kByte of memory per PE. The MasPar software provides facilities to employ only a 1,024-node (32x32) or 2,048-node (64x32) portion of the PE-array. Thus results will be presented for code executed on the 1 k, 2 k, and 4 k processor boards.

MPF and MPL provide for different approaches to parallel implementation. The former is a version of Fortran 90 [17, 18] with extensions. The MPF compiler puts variables defined according to the Fortran 77 standard on the front-end, while those adhering to the Fortran 90 standard are placed onto the PE array. In the latter case, for a three-dimensional spatial array as used in the FDTD method, unless special mapping directives are used, the first dimension is mapped along the x-direction of the PE-array and the second dimension along the y-direction, with further dimensions going into memory. A layering into memory is used if the actual dimension exceeds that of the processor grid in the corresponding direction. The compiler assigns operations it considers parallel to the DPU. All other operations reside on the front-end. Because the compiler decides what is parallel and what is not, the programmer must be very careful to structure the program so that data is not continually traveling between the DPU and the front-end. This effect, called ‘sloshing’, can greatly diminish the parallel gains made. In contrast, MPL, based on Kernighan and Ritchie C, allows for specific placement of data on, and manipulation of, the PE array. Variables are defined as either singular or plural, with the singular variables residing on the front-end and the plural variables being defined on every PE residing on the DPU. This flexibility provides the programmer with an explicit way to avoid sloshing as well as to implement an algorithm requiring software control at the PE level. The price to be paid for this is that the amount of programming effort and careful algorithm design is much greater than in MPF. Also, the ensuing code tends to be not very portable since it is designed for a specific configuration of the parallel machine. For example, running a code on a different partition of the PE-array requires only a simple compiler directive in MPF, while a more thorough rewrite may be necessary in MPL.
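
The default mapping of a three-dimensional array onto a two-dimensional PE grid can be pictured with the following sketch. The text above does not specify the exact layering scheme used by the compiler, so the cyclic (‘cut-and-stack’) layering assumed here, the function name `map_to_pe`, and the returned tuple layout are all hypothetical and serve only to illustrate the idea.

```python
def map_to_pe(i, j, k, pe_nx, pe_ny):
    """Map grid point (i, j, k) to a PE and a position in that PE's memory.

    Assumption: the first two indices wrap cyclically around the PE grid,
    and every wrap-around opens a new memory layer. The third index always
    goes into memory.
    """
    pe = (i % pe_nx, j % pe_ny)
    memory_slot = (i // pe_nx, j // pe_ny, k)
    return pe, memory_slot


# A point that fits the 64 x 64 grid directly stays in layer (0, 0).
print(map_to_pe(5, 6, 3, 64, 64))
# A point beyond the grid edge in x is layered into the memory of PE (6, 5).
print(map_to_pe(70, 5, 3, 64, 64))
```

Under this assumed scheme, two grid points that are neighbors in z always reside on the same PE, which is why the z-direction can be handled by a local loop.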

The MPF implementation of the problem under study basically involved conforming the sequential FDTD code into a more parallel form, i.e. rewriting the code to the Fortran 90 standard. Despite the fact that the FDTD method has much parallel structure, the actual code written to implement it on a sequential computer was by definition not parallel. For the compiler to properly map the code onto the PE array, care was taken to eliminate as much as possible any remnants of sequential code. The field coordinates were mapped onto the two-dimensional PE-array using the default compiler assignments along the x- and y-directions, with the z-direction and time assigned to memory. As a consequence computations became parallel in two dimensions. The third dimension, z, however must be carried through sequentially with an iterative loop construct for each PE. Only X-net communications of nearest and next-nearest neighbor magnitude are used to update the fields both in the interior and along the boundary. For the interior grid points, N sequential operations are replaced by one parallel operation. Along the boundary, in a number of cases only one edge of the PE grid is being used, causing the interior PE’s to remain idle while the boundary field is being updated. Data redistribution from memory to the PE array is a possible remedy to this (as will be discussed below) but is difficult to implement using MPF. Specific implementation details regarding MPF were given in our previous paper [9], although there the boundary conditions were not yet parallelized, a shortcoming that has now been circumvented.

MPL can be used to re-map data along a boundary from memory to the PE array using a coordinate
‘rotation’ (see Fig. 1). Because MPL can explicitly place, define, and change data on the PE grid,
a coordinate ‘rotation’ can be accomplished by taking data along the boundary that runs into memory,
and redistributing it amongst all PE’s. Suppose, for example, that the x = 0 boundary condition given by
equation (3) needs to be implemented. If the data is stored according to the default, only the processors in
the first column of the PE-array will be active and a loop over memory locations (z-direction) needs to be
performed (left-hand side of Fig. 1). This constitutes a very inefficient use of the parallel array since the
vast majority of processors are idle. It is possible to perform a coordinate ‘rotation’ and to map the x = 0
boundary onto a temporary two-dimensional array which is completely stored on the PE-array (right-hand
side of Fig. 1). Now these boundary sites may be updated in a single cycle, a much more efficient
use of the processors. However, the price to be paid for this parallelism is that data needs to be broadcast
from memory on processors in the first column to all other processors. While there are efficient software
directives to speed up this broadcast, this is still a time-consuming step and it may outweigh any speed-up
gained from having all processors active. Whether or not the coordinate ‘rotation’ leads to a gain in
speed depends on the speed of communication of the parallel machine used and on the amount of
computation involving boundary sites. In the case under study it was found that the redistribution led to a
longer execution time compared to a code in which no such ‘rotation’ was performed. However, the
stratagem is a valuable one and may be useful in more complex situations or on parallel computers that
have a faster broadcast mechanism than the MasPar MP-1. In particular, it might be very well suited for
hypercubes since a one-to-all broadcast can be performed very efficiently on this interconnection network
[13].
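
The trade-off behind the coordinate ‘rotation’ can be illustrated with a toy cost model. The sketch
below is not taken from the MPL code: the function names, the unit update cost, and the broadcast cost
are all assumptions chosen for illustration, and no MasPar timings are implied.

```python
import math

def boundary_time_default(iy, iz, py, t_update=1.0):
    """Default layout: one PE column holds the x = 0 plane and loops over z in memory."""
    return math.ceil(iy / py) * iz * t_update

def boundary_time_rotated(iy, iz, px, py, t_update=1.0, t_broadcast=0.0):
    """Rotated layout: the Iy x Iz boundary plane is spread over the whole Px x Py array.

    t_broadcast is a hypothetical one-off cost for redistributing the data.
    """
    layers = math.ceil(iy / py) * math.ceil(iz / px)
    return t_broadcast + layers * t_update

def rotation_pays_off(iy, iz, px, py, t_update=1.0, t_broadcast=0.0):
    """True if the rotated update, including redistribution, beats the default loop."""
    rotated = boundary_time_rotated(iy, iz, px, py, t_update, t_broadcast)
    return rotated < boundary_time_default(iy, iz, py, t_update)
```

In this model the rotation only wins when the assumed broadcast cost is small compared with the
sequential loop over z, which is the qualitative behavior described in the text.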


Fig.  1:  Rotation  from  memory  (mem)  to  two-dimensional  PE-array  in  order 
to  ensure  load  balancing  during  updates  of  boundary  sites  (see  text). 

The MPL code without boundary rotation turned out to be marginally faster than the corresponding
MPF code, which is hardly surprising considering the explicit programmer control over data placement.
However, the gains in execution time were only on the order of 10 % and hardly justify the amount
of programmer effort involved. They illustrate the effectiveness of the MPF compiler in generating
efficient parallel code for this kind of problem. In the remainder of this paper only results obtained by
the MPF code will be shown, since the general trends for the MPL code were identical, apart from its
slightly higher speed.

3. Results and discussion.

For the model problem considered here calculations were performed for a range of grid sizes,
with the total number of grid points equal to Ix × Iy × Iz, where Ix, Iy, and Iz are the number of grid points in
the x-, y-, and z-direction, respectively. Thus all field variables were taken to be matrices of the form
A(Ix,Iy,Iz). To investigate scaling properties, in all calculations Iy and Iz were kept fixed, but Ix was
allowed to vary. Not surprisingly the execution time for the sequential program (executed on a DECstation
5000) showed a perfect linear dependence on Ix. In stark contrast, Fig. 2 shows the execution time for
500 time steps on various partitions of the parallel machine. Note that this time remains constant
over a range of problem sizes and exhibits discontinuous jumps whenever a wrap-around at the
PE-boundary occurs, leading to a layering in memory. The various symbols indicate that computations were
performed on a 32x32 PE-array (open squares), a 64x32 PE-array (solid circles), and a 64x64 PE-array
(solid squares). On the 32x32 array the jumps occur whenever Ix exceeds a multiple of 32, while on the
other arrays this happens at multiples of 64, commensurate with the dimension of the PE-array in the
x-direction. The ratio between the execution times for various plateaus is always very close to an integer,
reflecting that the calculations are essentially completely parallel. Note, however, that there tends
to be a slight increase in computation time near the middle of the plateaus. This is because for such Ix
values the communication between boundary sites is long-range and thus involves a router call. On the
other hand, for Ix close to a multiple of the PE-dimension in the x-direction a short-range (and faster)
X-net call will be sufficient.
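
The plateau structure follows from simple counting. The sketch below is an idealized model, not the
authors' code: it assumes that execution time is proportional to the number of memory layers needed
to hold one x-y plane, and it ignores the router and X-net communication effects mentioned above.
The function names and the fixed value of Iy used in any call are illustrative only.

```python
import math

def memory_layers(ix, iy, px, py):
    """Number of memory layers needed to hold an Ix x Iy plane on a Px x Py PE array."""
    return math.ceil(ix / px) * math.ceil(iy / py)

def plateau_ratio(ix_a, ix_b, iy, px, py):
    """Ratio of idealized execution times for two problem sizes on the same partition."""
    return memory_layers(ix_a, iy, px, py) / memory_layers(ix_b, iy, px, py)
```

For example, on a 32-wide array the model stays on one plateau for Ix from 1 to 32 and jumps to twice
the time at Ix = 33, whereas on a 64-wide array the first jump occurs at Ix = 65.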

Fig. 2: Execution time (in ms) as a function of number of points in x-direction, Ix, for
various PE-array sizes.

Fig. 3: Speed-up of parallel program relative to sequential program on DECstation 5000
as a function of problem size, Ix.

A useful measure of parallel program performance is the speed-up, S. This is defined as the ratio
of the execution time, T1, for the best sequential algorithm running on a single processor to the
execution time, Tp, for the parallel algorithm on p processors. Efficient parallel programs typically show a
linear increase in speed-up as the number of processors is increased. However, if the problem size is
fixed, speed-ups tend to level off or even collapse as p is increased, since it may not be possible to take
full advantage of all available processors or the communication overhead may become burdensome. This
is a consequence of a very important result in parallel computing known as Amdahl’s law [13]. As
observed by Gustafson, the way out of this dilemma is to increase the problem size as the number of
processors is increased. Algorithms that are able to maintain efficiency under this operation are known as
‘scalable’ and are considered optimal.
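
The contrast between fixed-size and scaled speed-up can be written down in a few lines. The sketch
below uses the standard textbook forms of Amdahl's law and Gustafson's scaled speed-up; the serial
fraction is a free parameter chosen by the reader, not a quantity measured in this work.

```python
def amdahl_speedup(serial_fraction, p):
    """Fixed-size speed-up on p processors when a fraction of the work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

def gustafson_speedup(serial_fraction, p):
    """Scaled speed-up when the parallel part of the problem grows with p."""
    return serial_fraction + (1.0 - serial_fraction) * p
```

With a serial fraction of 5 %, the fixed-size speed-up can never exceed 20 however many processors are
used, while the scaled speed-up keeps growing almost linearly with p.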

On a SIMD system it may not be possible to run the program on a single processor. Instead, in
the definition of speed-up one may use the time taken by the sequential program on another computer as
the reference point. In the present work we have taken the serial time T1 to be that needed to execute the
original sequential program on a DECstation 5000. The ensuing speed-up (on a 64x64 array) as a function
of problem size is shown in Fig. 3. Note that the speed-up peaks near multiples of 64
(where full use of the PE-array is made) and then abruptly drops off when a layering in memory occurs
(corresponding to the jumps between plateaus in Fig. 2). The size of the drops diminishes as Ix increases,
with speed-up converging to a peak value of about 25. Moreover, by taking the values listed in Fig. 2 and
plotting scaled speed-up versus number of processors one finds nearly perfect scalability, illustrating that
the algorithm has essentially been completely parallelized.
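
The sawtooth shape of the speed-up curve can be reproduced qualitatively with a utilization
argument. The following sketch is an assumption-laden illustration rather than a fit to Fig. 3: it
supposes that the sequential time grows linearly with Ix and that the parallel time grows with the
number of memory layers, so that speed-up is proportional to the fraction of PE columns doing
useful work.

```python
import math

def pe_utilization(ix, px=64):
    """Fraction of PE columns doing useful work, averaged over the memory layers."""
    layers = math.ceil(ix / px)
    return ix / (layers * px)

def drop_after_multiple(k, px=64):
    """Relative loss in utilization when Ix goes from k*px to k*px + 1."""
    before = pe_utilization(k * px, px)
    after = pe_utilization(k * px + 1, px)
    return (before - after) / before
```

In this model the utilization returns to 1 at every multiple of 64, and the drop just past the k-th
multiple shrinks as k grows, in line with the diminishing drops described above.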

These results form a satisfying complement to the work by Varadarajan and Mittra [10] on a very
similar problem in a MIMD environment. These authors found that to maintain speed-up at an optimal
value they had to scale the problem in such a way that computation-to-communication ratios were kept
much larger than unity. Moreover, in addition to this consideration, problem sizes had to be increased
with increasing number of processors to maintain efficiency. Thus, although the details of the implementation
differ, the essential factors necessary to guarantee an optimal use of resources are the same in their
work as in the present one. A number of other authors (see [5] and references therein) have also quoted
excellent speed-ups on scaled problems. It should be pointed out that SIMD architectures may be less
suited for more complex problems involving irregular grids or adaptive meshes. In those cases MIMD
computers hold the edge, although load balancing and synchronization may be cumbersome. Also, once
the decision is made to invest development effort on a MIMD platform one should additionally consider
using MOM or FEM techniques rather than exclusively the simpler FDTD method.

In summary, the FDTD method is very well suited for implementation on parallel computers, be it
within a SIMD or a MIMD framework. This is a consequence of this technique’s regular structure,
simultaneous updating, and short-range communication. The main loss of efficiency is due to the boundary
conditions: all processors containing only interior grid points are idle while boundary sites are being
updated. On a two-dimensional mesh a redistribution of the load may be accomplished through a rotation
from PE-memory to the PE-array, but this in itself incurs considerable overhead which may offset the
gains obtained by having all processors active. Nevertheless, in more complex situations such a step may be
justifiable in order to attain complete parallelism in the calculations. The resulting code shows excellent
scalability, demonstrating a good match between problem formulation and parallel architecture. The
speed-up that can be attained over a single processor workstation is already considerable. Thus, it is only
to be expected that further improvements will be possible as faster parallel computers with more
processors enter the market place.

Introduction

Cancer  has  been  a  leading  cause  of  death  in  the  United  States  for  several  decades  and  remains 
so  today;  therefore,  it  is  a  leading  topic  of  research.  Current  researchers  employ  several  methods  to 
destroy and limit the growth of cancerous tissue. However, all methods share a
characteristic: they destroy healthy tissue as well as the tumor. One promising option in the
treatment of cancer involves concentration of microwave energy at the tumor site. For this,
electromagnetic  waves  are  launched  into  the  tissue  from  many  different  locations.  The  waves  pass 
through  the  tissue  and  cross  at  one  point  where  constructive  interference  occurs.  At  this  location,  the 
wave  form  amplitude  is  significantly  higher  than  at  any  other  point  in  the  tissue.  This  area  of  higher 
molecular  vibration  results  in  hyperthermia.  In  addition,  the  increased  microwave  levels  have  been 
shown  to  aid  in  chemotherapy. 

The current drawback to this technique in treatment is achieving an optimal combination of
resolution  and  deep  penetration.  Electromagnetic  waves  can  penetrate  bone  and  muscle  structures 
with  little  reflection,  but  high  frequencies  have  not  been  able  to  penetrate  deeply  into  high  water- 
content  tissue  such  as  muscle.  This  has  made  resolution  at  depth  difficult  in  the  past. 

Using  a  computer  model,  this  paper  deals  with  focusing  high-frequency  electromagnetic  waves 
within  a  human  head.  Although  deep  penetration  through  the  muscular  tissue  of  the  brain  is  difficult, 
this  study  shows  that  with  the  combination  of  high  frequencies  and  constructive  interference, 
microwave  concentration  is  possible  even  for  tumors  deep  in  the  center  of  the  brain. 

Rappaport  and  Morgenthaler  [1]  derived  an  optimal  field  distribution  for  radiating  a  tumor 
within  a  homogeneous  sphere  of  muscle  tissue.  To  apply  this  theory  to  the  inhomogeneous  structure 
of  a  human  head,  the  Finite  Difference  Time  Domain  (FDTD)  technique  for  electromagnetics  is  used 
[2],  which  approximates  Maxwell's  differential  equations  as  finite  differences  and  directly  solves  them 
across  time  and  space. 
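
To make the idea of replacing Maxwell's differential equations by finite differences concrete, a
minimal one-dimensional leapfrog update is sketched below. This is a generic textbook-style
illustration in normalized units, not the three-dimensional Penn State code used in this work; the
grid size, the Courant number of 0.5, and the Gaussian source parameters are all assumptions.

```python
import math

def fdtd_1d(n_cells=200, n_steps=150, courant=0.5, source_cell=100):
    """Leapfrog update of Ez and Hy on a staggered 1-D grid in normalized units."""
    ez = [0.0] * n_cells          # electric field at integer grid points
    hy = [0.0] * (n_cells - 1)    # magnetic field at half-integer grid points
    for step in range(n_steps):
        # Advance H using the spatial difference of E
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Advance interior E using the spatial difference of H;
        # the two end cells are left at zero (perfectly conducting walls)
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Additive source: a Gaussian pulse injected at a single cell
        ez[source_cell] += math.exp(-((step - 30.0) / 10.0) ** 2)
    return ez
```

Each field value is updated only from its nearest neighbors, which is the same locality that makes
the three-dimensional method straightforward to apply to inhomogeneous material grids.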

Methodology 

This  research  entails  setting  up  three  distinct  simulations.  The  first  duplicates  the  analytical 
results  using  FDTD  on  a  homogeneous  sphere  using  the  Penn  State  University  FDTD  code  [3].  A 
series of FDTD simulations on laminated spheres comprise the second set of simulations. This set
provides  insight  into  the  effects  of  inhomogeneities  on  the  propagation  of  the  ideal  source  radiation.  A 
series  of  FDTD  simulations  on  an  actual  model  of  a  human  head  developed  from  an  MRI  scan 
comprise  the  final  set  of  simulations.  The  incident  field  is  specified  in  the  FDTD  code  by  setting  up  an 
electric  shell,  with  a  radius  of  9.45  cm,  around  the  sphere  and  specifying  the  E-field  values  at  every 
cube  along  this  shell.  This  distribution  then  radiates,  and  the  field  propagates  through  the  FDTD  grid, 
producing  the  same  results  as  the  analytic  version.  This  verifies  the  use  of  FDTD  to  simulate  a 
constant  E-field  around  a  sphere. 


1 This work was supported by the U. S. Air Force Wright, Armstrong, and Phillips Laboratories.

Code Modifications

Two main modifications are made to the Penn State FDTD code. The code is designed for
use in antenna calculations; thus, it is set up to have a small number of feeds, one for each antenna.
Forcing an E-field around the surface requires approximately 50,000 feeds. Specifying each one
individually proves to be both time consuming and memory intensive. To avoid this pitfall, the FEbU
subroutine is modified to calculate each value using the analytic field equations. To reduce the
computational time at each time step, the subroutine approximates the source by computing the
analytical value the components would have at the center of the cube and gives all six components
their value based on this result rather than recalculating the field for every component.

The second major modification is the addition of a peak E-field storage array. This allows the
user to specify an area over which the peak E-field value will be recorded. This is only useful because
the FDTD simulations are using one frequency. By recording the peak value of the E-field, the power
deposited can be easily calculated using P = σ|E|²/2, where σ represents the conductivity of the
material. Thus a simple conversion into the frequency domain has been made for comparison to the
analytic solution.
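
The peak-storage idea can be sketched as follows. The function names and the list-based storage are
assumptions made for illustration and do not reproduce the modified FDTD code; the formula is the
single-frequency expression quoted above.

```python
import math

def peak_field(samples):
    """Largest magnitude reached by a time series of E-field samples at one cell."""
    peak = 0.0
    for e in samples:
        peak = max(peak, abs(e))
    return peak

def deposited_power(peak_e, sigma):
    """Time-averaged power density P = sigma * |E|^2 / 2 for a single-frequency field."""
    return 0.5 * sigma * peak_e ** 2
```

Because only one running maximum per cell is kept, the full time history never has to be stored,
which is what makes the approach practical only for single-frequency excitation.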

These two modifications allow the source distribution to be specified, and the resultant field to
be recorded and analyzed. The first modification allows the FDTD code to implement the source
distribution around any geometry specified in the FDTD space. The code propagates the field through
the space and the second modification allows the frequency response to be recorded for analysis in
MATLAB.

FDTD  Simulations  and  Results 

Three  basic  forms  of  FDTD  simulations  are  done.  All  three  forms  are  set  up  with  identical 
FDTD space parameters, but the geometry is changed for each simulation. For these simulations, a
92 × 92 × 95 grid is set up with cubic cells 2.55 millimeters on a side. The Courant stability condition [2]
indicates a 49.19 picosecond time step. The outer sphere in each simulation has a radius of 37 cells,
or  9.45  cm.  The  analytical  solution  sets  the  spherical  source  distribution  at  the  radius  of  37  cells. 
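
For reference, the standard Courant limit for the three-dimensional Yee scheme can be evaluated
directly. The sketch below assumes free-space propagation speed and uses a hypothetical function
name; it makes no claim about how the time step quoted above was obtained. For 2.55 mm cubic cells
it evaluates to roughly 4.9 ps, so the decimal placement of the quoted figure may be worth checking
against the original publication.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def courant_limit(dx, dy=None, dz=None, c=C0):
    """Largest stable time step for the 3-D Yee scheme with cell size (dx, dy, dz)."""
    dy = dx if dy is None else dy
    dz = dx if dz is None else dz
    return 1.0 / (c * math.sqrt(1.0 / dx**2 + 1.0 / dy**2 + 1.0 / dz**2))
```

For cubic cells the expression reduces to dx / (c √3).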

The first simulation is on a simple homogeneous sphere. This simulation serves as a
verification of the code. It is to demonstrate the ability to use the FDTD code with the entire incident
field specified. The results of this simulation are then compared with the analytic results.

A four layer model comprises the next simulation. This contains an outer shell of bone to
simulate a bone-like fluid bolus, a thin layer of muscle tissue simulating the skin layer, a bone sphere
to simulate the skull, and finally an inner shell of muscle simulating the brain. For this simulation the
radius of the inner sphere is 32 cells. The outer radius of the bone layer is 34 cells, and the radius of
the skin shell is 35 cells.
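
The layered geometry can be expressed as a simple radius test per cell. The sketch below uses the
radii given in the text (32, 34, 35, and 37 cells); the grid center, the label strings, and the
function name are assumptions introduced for illustration and are not part of the FDTD code.

```python
import math

# Outer radius of each layer in cells, innermost first, with an illustrative label.
LAYERS = [
    (32, "muscle (brain)"),
    (34, "bone (skull)"),
    (35, "muscle (skin)"),
    (37, "bone-like bolus"),
]

def material_at(i, j, k, center=(46, 46, 47)):
    """Material label for cell (i, j, k); 'background' outside the outermost sphere."""
    r = math.dist((i, j, k), center)
    for radius, label in LAYERS:
        if r <= radius:
            return label
    return "background"
```

Because each cell is assigned entirely to one material, the spherical interfaces are represented by
stair-stepped surfaces on the cubic grid.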

An FDTD model of an actual human head provides the basis for the final simulations. This
model is made by converting an MRI scan of a human head into a head mesh to be read in by the
FDTD code. This is a four tissue model of the head created by David Steich of Penn State University.
To create it, the actual permeabilities, permittivities, conductivities and magnetic conductivities are
never directly measured; instead typical values are inserted for each material type. This provides a
first order approximation to the human head. A basic neck extension was added to the
head to simulate the inability to place a source in the neck region. The neck is a simple
extension of the final layer in the head, down to the edge of the FDTD grid.

A liquid bolus surrounds this head approximation. This bolus allows the use of the spherical
source distribution on the edge of the bolus, providing a closer match in material parameters as the
field propagates into the head. Although it is about 9.45 cm to the center of the head, the head is not
spherical; therefore, the full size head cannot fit within the 9.45 cm bolus and has to be shrunk for the
initial tests. To accomplish this, the original head mesh cell size of 3.2 mm is reduced to 2.55
millimeters.
The material characteristics of the bolus are chosen as close as possible to those of muscle. For
the  next  simulation,  the  cell  size  increases  from  2.55  millimeters  to  3.2  millimeters  on  each  side  of  the 
FDTD  cubes.  This  simulation  provides  an  opportunity  to  determine  if  it  is  possible  to  increase  the 
penetration  depth  in  the  actual  head.  Brain  tissue  is  less  lossy  than  muscle  tissue;  therefore,  it  is 
conceivable  that  greater  penetration  may  be  possible  in  the  presence  of  the  inhomogeneities  of  the 
human  head. 

Homogeneous Sphere

The simulation on the homogeneous sphere illustrates that FDTD can be used in the manner
described. The difference from the analytic results shows only a 6% error with a standard deviation of
8.29%.  The  majority  of  this  error  comes  from  incomplete  source  coupling.  An  analysis  of  the  power 
profile  demonstrates  spikes  along  the  top  and  bottom  of  the  sphere.  This  effect  is  pronounced  at  the 
top  and  bottom  of  the  sphere  because  of  the  polarization.  Since  the  tangential  component  of  the  E- 
field  is  always  continuous  and  the  field  is  essentially  vertically  (z)  polarized,  the  field  along  the 
equator  is  tangential  and  couples  completely.  The  field  near  the  poles  is  normal  to  the  sphere's 
surface.  Because  the  normal  E-field  is  discontinuous  by  a  factor  of  the  difference  in  the  permittivities, 
the  field  at  the  poles  does  not  completely  couple  into  the  muscle  tissue,  thus  forming  the  spikes.  It  is 
believed  that  this  could  be  reduced  by  defining  the  sphere  to  extend  beyond  the  source  distribution, 
thereby  eliminating  this  discrepancy  in  material  parameters. 

Laminated  Sphere 

The  next  simulation  is  on  a  four-layer  sphere.  The  material  properties  of  the  outer  sphere 
model  bone-like  material,  the  next  layer  patterns  muscle  material,  the  third  layer  is  defined  as  bone 
material,  and  finally,  the  properties  of  the  inner  sphere  imitate  muscle  tissue.  Therefore,  this 
simulation  is  attempting  to  recreate  a  liquid  bolus  with  bone-like  electrical  characteristics  around  a 
head.  This  simulation  models  the  head  as  a  muscle-like  brain  core  surrounded  by  a  spherical  skull  of 
bone  and  finally  a  thin  skin  layer  of  muscle  tissue. 

Figure 1 shows the E-field distribution across the central xy cut of the laminated sphere
normalized to one at the center. The field along the equator of the sphere is predominately directed in
the  z  direction.  Thus,  along  this  equatorial  cut,  the  field  will  be  tangent  to  the  surface  of  each  sphere. 
Because  the  tangential  E-field  has  to  be  continuous  across  a  dielectric  boundary,  along  this  cut  the 
field  should  be  continuous  everywhere.  Figure  1  shows  this  is  not  the  case.  There  are  E-field  spikes 
along  the  edges  of  the  inner  spheres.  The  stair  stepped  edges  of  the  spheres  cause  this  phenomenon. 
Instead  of  having  a  smooth  surface,  small  cubes  form  the  edge.  This  edge  produces  a  locally 
horizontal  rather  than  vertical  surface.  At  this  point  the  E-field  becomes  normal  rather  than  tangential. 
The  amount  of  the  discontinuity  of  the  normal  E-fields  is  proportional  to  the  difference  in  relative 
permittivities  of  the  two  materials;  therefore,  the  field  spikes  in  the  cube. 

Figures  1  and  2  illustrate  the  effect  of  the  staircase  errors.  Figure  1  shows  the  E-field 
distribution  across  the  sphere  in  a  two-dimensional  image.  The  spikes  appear  as  single  pixels  slightly 
darker or lighter than those around them. Figure 2 is a plot of the changes in the IDTHRE components
from level z = 39 to level z = 40, where IDTHRE is an array containing the material ID’s for the z
directed geometry components. Wherever the IDTHRE, or z directed component, changes from one
level to the next, there is a locally horizontal surface where it should be vertical; therefore, Figure 2
shows  only  the  location  of  each  stair  step.  Investigation  of  Figures  1  and  2  reveals  that  every  spike 
corresponds  to  the  edge  of  a  stair  step.  These  spikes  are  numerical  artifacts  which  will  not  actually 
occur. 
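
The stair-step locations plotted in Figure 2 amount to a comparison of material IDs between two
adjacent z levels. A minimal sketch of that comparison is given below; the nested-list layout
ids[z][y][x] and the function name are assumptions, and the real IDTHRE array in the FDTD code may be
organized differently.

```python
def stair_steps(ids, z_lower, z_upper):
    """Return (x, y) positions where the material ID differs between two z levels.

    ids[z][y][x] holds an integer material ID for each cell.
    """
    steps = []
    for y, (row_lo, row_hi) in enumerate(zip(ids[z_lower], ids[z_upper])):
        for x, (a, b) in enumerate(zip(row_lo, row_hi)):
            if a != b:
                steps.append((x, y))
    return steps
```

Every position returned marks a locally horizontal interface, which is where the normal-field
discontinuity, and hence a spike, can be expected.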

After investigating the effects of the simple inhomogeneities, the remaining two simulations
again use the model of the head developed from an MRI scan. The material characteristics define
three different materials with muscle-like properties, representing skin and white and gray matter, and
one material with the low-water content properties of bone. The first of these uses a head that is
reduced to fit within the 9.45 cm source shell. The second uses a full size head, with an enlarged
source shell to fit around it.

Figure 1. 2-D image of the E-field distribution across the z = 39 cut of the 4-layer laminated sphere
(panel title: E-Field Across the 4-Layer Sphere at Z=39; color scale: E-field in V/m; axis: Y location).

[...] normalized [...] a narrow spike near the center indicates
accurate focusing. Total Mean (TM) is a measure of the mean power everywhere else in the head.
[...] The third value in the table is the ratio of the mean power at the center to the total mean.
The higher this value, the easier it will be to heat the tumor without affecting healthy
tissue. [...] percent of cells with power > 1. Each of these points will reach temperatures higher than the
central [...] healthy tissue.
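
The quantities described here (total mean, center-to-mean ratio, and percentage of cells with
normalized power above 1) can be computed as in the sketch below. This is a hypothetical
reconstruction from the surviving text: the function name, the flat list of per-cell powers, and the
use of a single center value rather than a mean over a central region are all assumptions.

```python
def head_statistics(power, center_power):
    """Summary statistics for per-cell power values outside the focal region.

    Values are first normalized so that the power at the center equals 1.
    """
    normalized = [p / center_power for p in power]
    total_mean = sum(normalized) / len(normalized)
    ratio = 1.0 / total_mean                      # center (normalized to 1) over total mean
    hot = sum(1 for p in normalized if p > 1.0)   # cells receiving more power than the center
    percent_hot = 100.0 * hot / len(normalized)
    return total_mean, ratio, percent_hot
```

A large ratio together with a small percentage of cells above 1 corresponds to the well-focused
case described in the text.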

Figure 2. A plot of the location of the stair stepped edges in layer 39 of the four-layer sphere
(panel title: Changes in IDTHRE Components of 4-Layer Sphere From Z=40 to Z=39; axis: Y location).


The spikes exist predominately in the neck region. Because there is a discontinuity in the
source distribution between the outside and the inside of the neck region, the field diffracts along the
edge of the source distribution, introducing large spikes proportional to this discontinuity. In the larger head,
the  higher  source  power  makes  the  stepped  edge  of  the  source  the  larger  contributor  to  the  diffracted 
field.  In  addition,  some  areas  along  the  outside  skin  could  reach  an  unhealthy  temperature. 

Conclusion 


This  research  demonstrates  that  FDTD  is  an  effective  tool  in  evaluating  several  vital  aspects  of 
microwave  hyperthermia  as  a  treatment  method.  Specifically,  the  initial  simulations  confirming  the 
optimal  analytical  solution,  derived  by  Rappaport  and  Morgenthaler,  demonstrated  the  utility  of  the 
FDTD numerical approach for analyzing this treatment scenario. The laminated sphere simulations
demonstrated  the  problem  of  stair  step  spiking.  Finally,  the  head  simulations  demonstrated  that  in 
theory,  microwave  hyperthermia  is  a  possible  treatment  option. 

Table 1. Statistics on the FDTD Head Model Simulations


The simulation on the inhomogeneous laminated sphere demonstrates the effects inherent in an
FDTD approach. The stair step approximation of a spherical surface introduced stair step spikes due
to a locally horizontal surface along an otherwise vertical interface. In addition to these spikes, the
FDTD approach made it difficult to obtain complete coupling from the source to the sphere of
treatment. These drawbacks not only demonstrate the effects of FDTD approximations, but also
introduce an area to be considered in microwave treatment. Because the head is not spherical, there
will be horizontal sections of interface along an otherwise vertical interface. For example, along the
base of the mandible, there is a long vertical interface; along the inside of the occipital cavity there are
vertical interfaces. All of these areas will be candidates for a true spiking phenomenon.

The simulations on the actual head models proved very encouraging. The source distribution
placed around a small head within a spherical water bolus produced excellent focusing. There were
only 20 cells with power above that at the center, all of which were confined to the neck region. This is due to the
stepped edge of the source distribution near the neck. There was no taper down as the source neared
the edge of the neck, thus launching a diffracted field near the edge of the source distribution, creating
several small but strong spikes in the neck region.

These spikes are troubling, although not discouraging. In addition to the fact that there was no
attempt made to taper the edge of the field distribution, the neck region itself was extremely
approximate. The MRI scan produced no neck region; therefore the head was simply continued down
from the base to the edge of the FDTD space. There was no spinal cord, esophagus or trachea
modeled. In addition, the head itself was approximated from an MRI scan. Although serving to
produce an example of what could be possible in this area, it still needs further research and
refinement.

These simulations demonstrated two important issues. First, these are
the first simulations to show FDTD can be used on a complete 3-D model of a human head in a
treatment scenario. Second, and most important, these simulations show that even without any form of
optimization to account for the inhomogeneous structure of the human head, it is possible to radiate a
deep-set tumor with reasonable precision.

Air Force Institute of Technology


Introduction 

There are many potential applications for Ultra-Wideband (UWB), short pulse radiating
systems: target recognition, collision avoidance, and detection through lossy materials (such as
concrete)  are  some  examples.  These  systems  require  broadband  antennas  with  low  dispersion 
characteristics;  the  transverse  electromagnetic  (TEM)  horn  [1]  is  one  such  antenna. 

A  TEM  horn  antenna,  shown  in  Figure  1,  is  a  two-conductor,  end-fire,  traveling  wave  structure. 
With  proper  selection  of  the  flare  angle  and  plate  widths,  the  TEM  horn  maintains  a  constant 
impedance  and  radiates  only  the  TEM  mode.  Unfortunately,  the  abrupt  transition  from  the  conductors 
to free space causes diffraction, which increases the off-boresight electric field strength. Applying a
Tapered Periodic Surface (TPS) [2] to the ends of an antenna eases the transition from conductive
elements to free space. A TPS is a lattice of wire or slot elements with progressively shorter lengths
from one edge of the surface to the other. A wire TPS is shown in Figure 2. The TPS gradually tapers
from a low reactance (Z = 0) to a high reactance (Z = -j∞), thereby reducing diffraction.

This  work  has  two  main  purposes:  first,  to  reduce  the  off-boresight  electric  field  levels  for  a 
TEM  horn  antenna,  and  second,  to  maximize  the  on-boresight  peak-to-peak  electric  fields  over  a 
bandwidth  of  20:1.  These  goals  are  met  by  applying  a  TPS  to  the  ends  of  the  horn.  Designs  are 
presented for a TEM horn antenna and for a TPS. The two can be designed separately, keeping in mind
that the goal is to reduce the off-boresight fields over a bandwidth of 300 MHz to 6 GHz. In general, the
design  of  a  TEM  horn  requires  selecting  the  length  and  width  at  the  aperture,  and  selecting  the  flare 
angle  between  the  plates.  The  parameters  are  adjusted  to  meet  the  four  design  criteria  for  an  ultra- 
wideband,  short  pulse  radiating  system,  namely:  1)  constant  amplitude  response,  2)  linear  phase 
response,  3)  no  reflections  or  resonances  along  the  conductors,  and  4)  wide  bandwidth. 

In  the  process  of  designing  a  TPS,  a  Periodic  Moment  Method  (PMM)  computer  code  [3] 
models  each  of  the  TPS  elements  and  computes  the  reflection  coefficient  for  each  element  versus 
frequency.  From  the  reflection  coefficient,  a  transmission  line  model  determines  the  sheet  impedance 
for  each  element.  The  final  TPS  geometry  is  then  determined  by  approximating  a  known  impedance 
function.  After  the  proper  geometry  is  determined,  a  two-dimensional  Finite  Difference  Time  Domain 
(FDTD)  code  [4]  models  the  TPS.  This  model  predicts  the  reduction  of  field  levels  in  the  shadow 
region of the model. After both the TEM horn and TPS designs are validated, the testing begins.
Testing of the TEM horn and TPS attachments is performed at the High Energy Research and
Technology Facility of the USAF Phillips Laboratory.

Methodology  and  Results 
TEM  horn  Design 

There are three main aspects to the design of a TEM horn [5, 6]: the low frequency cutoff, the
high frequency cutoff, and the characteristic impedance. The cutoff frequencies determine the limits of
the horn's performance; the characteristic impedance provides the details necessary for designing the
feed to the antenna. The length of the conductors determines the lowest frequency that propagates in a
TEM horn. The low frequency cutoff (6 dB) is given by [5]:

$$f_{\text{low}} = \frac{c}{2L} \tag{1}$$

1 This work was supported by the U.S. Air Force Wright and Phillips Laboratories.
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Figure  1.  TEM  Horn  Antenna 


Figure 2. Wire Type of Tapered Periodic Surface

where L is the length of the horn and c is the speed of light. Therefore, the length must be at least
19.69" to propagate a signal with 300 MHz components. The high frequency cutoff is determined by
the flare angle between the plates and, consequently, the height of the TEM horn at the aperture.
The upper cutoff frequency (6 dB) is given by [6]:

$$f_{\text{high}} = \frac{0.604\,c}{2L\sin^{2}(\beta/2)} \tag{2}$$


where β is the flare angle between the plate and the ground plane. The highest desired frequency
is 6 GHz; the flare angle between the plate of the TEM horn and the ground plane is 20°.

The height and width of the TEM horn at the aperture define the characteristic impedance of
the TEM horn. For a TEM horn mounted above a ground plane, the characteristic impedance is

where K is the antenna sensitivity, w is the width of the horn at the aperture, and h is the height above
the ground plane. In order to keep the phase difference less than 30°, the width must not exceed
5.078"; a width of 4.921" was chosen. The corresponding characteristic impedance is 288 Ω,
assuming the sensitivity K = 1.
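
As a quick check of the cutoff figures quoted above, the following sketch evaluates both cutoff frequencies for the 19.69" horn with a 20° flare angle. It assumes the low cutoff has the form f_low = c/(2L), which is the form implied by the 19.69" and 300 MHz pair, and that the argument of the sine in the upper cutoff is β/2; the function names are illustrative only.

```python
import math

C = 299_792_458.0   # speed of light, m/s
INCH = 0.0254       # metres per inch


def f_low(length_m):
    """Low-frequency cutoff, assuming f_low = c / (2 L)."""
    return C / (2.0 * length_m)


def f_high(length_m, flare_deg):
    """High-frequency cutoff, assuming f_high = 0.604 c / (2 L sin^2(beta / 2))."""
    half_angle = math.radians(flare_deg) / 2.0
    return 0.604 * C / (2.0 * length_m * math.sin(half_angle) ** 2)


length = 19.69 * INCH  # horn length quoted in the text
print(f"f_low  = {f_low(length) / 1e6:.1f} MHz")
print(f"f_high = {f_high(length, 20.0) / 1e9:.2f} GHz")
```

Under these assumptions the two values land close to the 300 MHz and 6 GHz band edges stated in the text.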

TPS  Design 

The purpose of a TPS is to gradually change the impedance along the surface.
Layers of conductive strips with a dielectric substrate between them are capacitive in nature [7]; as the
amount of overlap in the strips decreases, the capacitance decreases, and the impedance therefore
approaches -j∞. The PMM code [3] models the geometry of the TPS elements as a double layer. The top layer
consists of copper conductive strips separated by gaps; the bottom layer has an identical pattern of
conductive strips, shifted back relative to the top layer. A slab of glass epoxy with ε_r = 4.5 separates the
layers. Fifty-nine elements are designed, starting with strip width = 0.87" (gap width = 0.01") and
decreasing to strip width = 0.29" (gap width = 0.59") in increments of 0.01". Each element has a
different reflection coefficient and, therefore, a different impedance. From this pool of fifty-nine
elements, twenty-five elements are chosen to approximate specific impedance functions.

Figure 3. Approximate vs. Desired Triangular Sheet Impedance Function at 3 GHz

Figure 4. Approximate vs. Desired Exponential Sheet Impedance Function at 3 GHz

Using a transmission line model, the sheet impedance of each element is determined from the
reflection coefficient R by:

$$Z_s = -\frac{Z_c\,(1 + R)}{2R}, \quad \text{where } Z_c = Z_0 \cos\eta \tag{4}$$

where Z_0 is the characteristic impedance of free space (377 Ω), and η is the angle from the normal (80°). Two taper
functions are designed, an exponential taper and a triangular taper, with impedances given by [8]:

$$Z_{\text{exp}}(d) = Z_0\, e^{\,d \ln(7000/Z_0)}, \quad \text{where } 0 < d < 1 \tag{5}$$

$$Z_{\text{tri}}(d) = \begin{cases} Z_0\, e^{\,2d^{2}\ln(7000/Z_0)} & \text{for } 0 < d < \tfrac{1}{2} \\ Z_0\, e^{\,(4d - 2d^{2} - 1)\ln(7000/Z_0)} & \text{for } \tfrac{1}{2} \le d < 1 \end{cases} \tag{6}$$
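
The two taper functions can be sketched in a few lines. This is an illustration only: it assumes the exponential taper has the standard form Z_0 exp(d ln(7000/Z_0)), takes Z_0 to be the 377 Ω defined for the transmission line model, and uses hypothetical function names.

```python
import math

Z_END = 7000.0  # ohms, end value that appears in the taper formulas


def z_exponential(d, z0):
    """Exponential taper, assumed form Z0 * exp(d * ln(Z_END / Z0))."""
    return z0 * math.exp(d * math.log(Z_END / z0))


def z_triangular(d, z0):
    """Triangular taper: quadratic exponent on each half of the length."""
    ln_ratio = math.log(Z_END / z0)
    if d < 0.5:
        return z0 * math.exp(2.0 * d * d * ln_ratio)
    return z0 * math.exp((4.0 * d - 2.0 * d * d - 1.0) * ln_ratio)


for d in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"d={d:4.2f}  exp={z_exponential(d, 377.0):8.1f}  tri={z_triangular(d, 377.0):8.1f}")
```

Both tapers start at Z_0, end at 7000 Ω, and pass through the geometric mean at d = 1/2; the triangular taper changes more slowly near the two ends.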


The  desired  impedance  taper  is  a  function  of  length;  the  taper  element  impedances  are  functions 
of  gap  width.  Matching  the  impedances  determines  the  gap  width  as  a  function  of  length. 
Synthesizing  a  taper  function  from  the  gap  width  versus  length  provides  an  approximate  impedance 
function.  The  approximate  taper  function  fits  25  impedance  points  to  the  desired  taper  function  using  a 
smallest  difference  method.  Figures  3  and  4  depict  just  how  well  the  approximate  function  fits  the 
desired  function. 
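
The matching step can be sketched as follows. The element data here are stand-ins, not PMM results: the pool is a hypothetical set of 59 purely capacitive sheets whose reactance grows geometrically with gap width, and the sheet impedance is recovered from the reflection coefficient with the shunt-sheet relation Z_s = -Z_c(1 + R)/(2R), which is assumed to be the transmission line model intended. The "smallest difference" rule is read as picking, at each of the 25 taper positions, the element whose impedance magnitude is closest to the desired value.

```python
import math

Z0 = 377.0                               # free-space impedance, ohms
ZC = Z0 * math.cos(math.radians(80.0))   # Z_c = Z_0 cos(eta), eta = 80 degrees
Z_END = 7000.0                           # ohms, end value of the taper


def sheet_impedance(refl):
    """Sheet impedance from reflection coefficient (assumed shunt-sheet model)."""
    return -ZC * (1 + refl) / (2 * refl)


def reflection(z_sheet):
    """Inverse relation, used here only to fabricate stand-in reflection data."""
    return -ZC / (2 * z_sheet + ZC)


# Hypothetical pool: gap widths 0.01" to 0.59", reactance magnitude 10 to 7000 ohms.
gaps = [round(0.01 * (i + 1), 2) for i in range(59)]
pool_refl = [reflection(-1j * 10.0 * 700.0 ** (i / 58)) for i in range(59)]
pool_mag = [abs(sheet_impedance(r)) for r in pool_refl]


def target(d):
    """Desired exponential taper (assumed form), 0 < d < 1."""
    return Z0 * math.exp(d * math.log(Z_END / Z0))


def match_taper(n_points=25):
    """Pick, for each taper position, the pool element with the smallest difference."""
    chosen = []
    for i in range(n_points):
        d = (i + 0.5) / n_points
        best = min(range(len(pool_mag)), key=lambda j: abs(pool_mag[j] - target(d)))
        chosen.append((d, gaps[best], pool_mag[best]))
    return chosen


taper = match_taper()
for d, gap, z in taper[:3]:
    print(f"d={d:.2f}  gap={gap:.2f} in  |Z|={z:.0f} ohm  desired={target(d):.0f} ohm")
```

With real PMM data the pool impedances would replace `pool_mag`; the selection loop itself would not change.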


FDTD  MODELING 

A  two-dimensional  FDTD  code  [4]  models  the  tapered  periodic  structures.  The  simulations 
predict the ability of the TPS design to reduce the off-boresight fields. There are four different
geometries: a triangular taper, an exponential taper, a short reference metal plate, and a long metal plate.
The short plate is 19.69" long and represents the original reference TEM horn. The long plate is
41.69" long and represents a TEM horn as long as the original horn plus the TPS attachment.

Figure  5  shows  the  diffracted  fields  from  the  exponential  taper,  triangular  taper,  and  long  plate 
relative to the field diffracted from the short reference plate. The long plate provides better reduction
at the shallow angles, but the tapers decrease the fields far off boresight. The exponential taper
outperforms the triangular taper in almost all cases. At the 90° test location, the exponential taper
reduces the diffraction by 7.839 dB. The triangular taper reduces the fields by 4.213 dB. The long plate
decreases the diffraction by only 0.04 dB.


TESTING  PROCEDURE 

The High Energy Research and Technology Facility of Phillips Laboratory at Kirtland AFB,
New Mexico, provides the facilities and equipment to test the three designs. Table 1 displays the peak
values of the fields at each location and the amount of reduction performed by each taper. The first
column, Test Pt., gives the test location for each measurement; the second row, pulse, is the main
radiated pulse. The other measurements are the diffracted field levels. The signal level is measured in
millivolts, and the Reduction (dB) is calculated as follows:

$$\text{Reduction (dB)} = 20 \log_{10}\!\left(\frac{V_{\text{peak}}}{V_{\text{peak, ref}}}\right)$$
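
The reduction figure can be reproduced from the tabulated peak values. The sketch below assumes the numerator of the ratio is the peak level measured with a taper attached and uses the 90° and main-pulse entries of Table 1; the function name is illustrative.

```python
import math


def reduction_db(v_peak, v_peak_ref):
    """Reduction in dB of a peak level relative to the reference horn."""
    return 20.0 * math.log10(v_peak / v_peak_ref)


# Peak levels in millivolts, taken from Table 1.
ref_90, exp_90, tri_90 = 3.57531, 1.12438, 1.37875
ref_pulse, exp_pulse = 13.1700, 9.86849

print(f"90 deg, exponential taper: {reduction_db(exp_90, ref_90):.3f} dB")
print(f"90 deg, triangular taper:  {reduction_db(tri_90, ref_90):.3f} dB")
print(f"main pulse, exponential:   {reduction_db(exp_pulse, ref_pulse):.3f} dB")
```

A negative value means the taper lowered the peak level relative to the reference horn.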

The  exponential  taper  outperforms  the  triangular  taper  at  all  locations  except  the  20°  off- 
boresight  test  point.  Even  for  the  on-boresight  maximization  that  did  not  behave  as  expected,  the 
exponential taper approached the goal more closely than the triangular taper. The 20° off-boresight
mark  is  a  confusing  point.  Since  that  is  the  angle  of  elevation  of  the  TEM  horn,  the  location 
corresponds  to  grazing  incidence  for  the  radiating  pulse  and  any  diffraction  occurring  at  the  edge. 
Diffraction  is  very  weak  near  grazing  and  is  difficult  to  predict  accurately.  The  discrepancy  at  this 
location  is  due  to  slight  shifts  in  the  probe  location. 

Conclusions 

This research involved the design, modeling, and testing of both a TPS and a TEM horn. The
goal was to reduce the off-boresight fields for a TEM horn and increase the peak-to-peak field levels
on boresight. This was accomplished by applying a TPS to the free space end. The TPS reduced the
diffraction from the free space end by providing a gradual transition from the conductive plates of the
TEM horn to free space.

FDTD Simulation Responses

Figure 5. Two-dimensional FDTD Field Levels Relative to the Field Diffracted from the Reference Plate

Test Pt.   Ref. Horn   Exp. Taper   Reduction   Tri. Taper   Reduction
(deg)      (mV)        (mV)         (dB)        (mV)         (dB)
pulse      13.1700     9.86849      -2.507      9.11670      -3.195
0          -8.87810    --           -3.354      -6.69750     -2.448
20         5.34938     --           --          5.22781      -0.200
30         9.63531     --           -1.381      8.65000      --
40         8.78906     5.79719      -3.614      --           -1.873
50         6.59125     3.67563      -5.073      4.54060      -3.237
60         5.31500     2.58125      -6.273      2.82281      -5.496
70         4.61313     1.77188      -8.311      2.28313      --
80         3.94125     1.36625      -9.202      1.64688      -7.591
90         3.57531     1.12438      -10.048     1.37875      -8.277

-- : entry not legible in the source. The 20° exponential-taper reduction is positive in the source, but its digits are not legible.

Table 1. Test Results for Diffracted Field Level

Two  TPSs  were  designed  —  one  that  approximated  an  exponential  impedance  function  and  one 
that  approximated  a  triangular  impedance  function.  Both  reduced  the  off-boresight  fields  of  the  TEM 
horn. In most cases the exponential taper reduced the field levels further than the triangular taper. For
example, at 90° off boresight, the exponential taper reduced the edge-diffracted field by 10 dB. The
triangular taper reduced the fields by 8.3 dB.

Therefore,  this  research  illustrates  that  a  TPS  is  an  effective  method  of  reducing  diffraction.  A 
procedure  for  designing  a  TPS  to  fit  a  specific  impedance  function  was  presented.  A  two-dimensional 
FDTD  model  predicted  the  tapers  would  reduce  the  diffraction,  and  experimentation  verified  the  TPS's 
ability  to  reduce  the  peak  off-boresight  field  levels  for  a  TEM  horn. 
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Abstract 

A  technique  to  approximate  the  complex  aperture  distribution  of  an  antenna  by  a  set  of  discrete 
sources  has  been  developed.  The  locations  for  a  set  of  sources  are  found  from  the  far-field  radiation 
pattern  using  an  imaging  technique.  The  far-field  electric  field  equations  for  this  set  of  electric  and 
magnetic  sources  are  transformed  into  a  set  of  matrix  equations,  which  are  then  solved  for  the  source 
magnitudes and phases. These sources can then be used to solve for the antenna's radiation pattern
performance  in  a  complex  structural  environment.  This  distributed  source  technique  was  applied  to 
the  case  of  a  planar  spiral  antenna  on  the  edge  of  a  small  ground  plane. 


I.  Introduction 

Placing  a  planar  spiral  antenna  on  a  complex  platform  can  require  considerable  effort  to  ensure 
that  the  surrounding  structures  do  not  unduly  affect  the  radiation  pattern.  Decisions  regarding  antenna 
placement  can  be  made  through  anechoic  chamber  studies  or  computational  electromagnetic 
simulations. This study was designed to validate the use of the Ohio State Basic Scattering Code
(NECBSC) [1, 2] for modeling this antenna on a simple ground plane.

NECBSC  is  a  high  frequency  ray  tracing/diffraction  code  based  on  the  Uniform  Geometrical 
Theory  of  Diffraction  (UTD).  In  general,  NECBSC  is  suited  for  structures  many  wavelengths  in 
dimension. Other codes using techniques such as the Method of Moments (MoM) or the finite
difference  time-domain  (FDTD)  method  provide  more  exact  modeling  of  the  antenna,  but  become 
computationally  impractical  when  other  large  structures  must  be  modeled  near  the  antenna.  NECBSC 
can  be  used  in  a  timely  manner  to  provide  mechanical  designers  information  on  the  effects  of  their 
designs  on  antenna  performance. 
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NECBSC  models  antennas  as  a  set  of  infinitely  small  electric  and  magnetic  current  elements. 
It  can  also  approximate  an  antenna  using  linear  interpolation  on  the  far-field  radiation  pattern  (the 
interpolated  source  technique).  With  this  technique  the  fields  radiate  from  a  single  point  weighted  by 
the  pattern  function  interpolated  from  the  measured  horizontally  and  vertically  polarized  patterns. 

For NECBSC modeling of a spiral antenna mounted on the edge of a small (4λ × 4λ) ground
plane, the interpolated source technique is not accurate because of the point source approximation.
This paper presents a distributed source technique to reduce the spiral antenna to a set of sources
located at the radiating regions of the spiral, thereby approximating the near-zone fields of the antenna
more accurately than is possible using the interpolated source technique. This improved
approximation to the near-zone fields of the antenna is important when calculating the radiation
pattern of the antenna in the presence of other large structures. The technique presented in this paper
is two-fold. First, the locations for a set of sources are found from the far-field radiation pattern using
an  imaging  technique.  Next,  the  far-field  electric  field  equations  for  this  set  of  electric  and  magnetic 
sources  are  transformed  into  a  set  of  matrix  equations.  These  are  then  solved  for  the  source 
magnitudes  and  phases.  Figure  1  shows  a  schematic  of  this  procedure. 


Cavity  Backed  Spiral 


Set  of  sources  with  computed 
amplitudes  and  phases  that  generate 
the  spiral  far-field  pattern 


Fig.  1  Distributed  Source  Technique. 


The far-field transformation technique was first developed by Mautz and Harrington [3] and
Pelton, Marhefka, and Burnside [4] for 2-D radiation patterns. This paper expands their technique to
3-D patterns using an imaging technique to determine the placement of the sources. This technique is
then  applied  to  a  planar  cavity  spiral  antenna  with  the  results  compared  to  measurements  and  the 
NECBSC  interpolated  source  technique. 




II.  Planar  Spiral  Antenna 


The  antenna  used  for  all  the  modeling  work  is  an  8-arm  right-hand-circular  (RHC)  log-periodic 
spiral. Each arm has six full turns, starting at a radius of 0.125 inches and ending at a radius of 11.0
inches. The operational frequency is from 0.5 to 1 GHz. Figure 1 shows a wire model of the spiral
antenna. 

The  antenna  is  placed  at  the  top  of  an  absorber  filled  cavity  that  is  6.3"  in  depth.  The  spiral 
produces radiation in both forward and backward directions. The downward radiation is absorbed by
the  cavity,  reducing  the  total  output  power  by  half.  One  advantage  of  the  spiral  is  its  wide  bandwidth. 
This  eliminates  the  need  for  using  more  than  one  antenna  to  cover  a  wide  range  of  frequencies.  The 
spiral  also  has  the  advantage  of  being  conformal  to  a  surface,  thereby  not  blocking  the  radiation  of 
other  emitters. 


III.  Antenna  Imaging 

Imaging the antenna current distribution, J_{x,y,z}(x,y,z), from the far-field radiation pattern is
necessary to determine optimum placement for the sources to be used with NECBSC. The imaging
technique used in this paper is based on work by Cook, Anderson, Whitaker, and Bennett [5] and
reduces to an integration of the radiation pattern (E_θ, E_φ) over one hemisphere given by

(1)

where

$$E_x(\phi,\theta) = E_\theta(\phi,\theta)\cos\phi\cos\theta - E_\phi(\phi,\theta)\sin\phi \tag{2a}$$

$$E_y(\phi,\theta) = E_\theta(\phi,\theta)\sin\phi\cos\theta + E_\phi(\phi,\theta)\cos\phi \tag{2b}$$

$$E_z(\phi,\theta) = -E_\theta(\phi,\theta)\sin\theta \tag{2c}$$

and k = 2π/λ is the free space wavenumber and C is a constant. Equation (1) can be integrated
directly or evaluated using FFT techniques. Figure 2 shows a 3-D view of the far-field of the spiral.
Figure 3 shows the normalized magnitude of the electric current distribution calculated over the top
surface of the antenna using Equation (1). Figure 4 represents a slice through the center of the image.
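
The conversion of the measured spherical components into the Cartesian components of Equations (2a) through (2c) can be sketched as below. The function name and sample values are illustrative; the field values may be complex.

```python
import math


def spherical_to_cartesian(e_theta, e_phi, theta, phi):
    """Cartesian components of a far field with only theta and phi components.

    theta and phi are in radians.
    """
    ex = e_theta * math.cos(theta) * math.cos(phi) - e_phi * math.sin(phi)
    ey = e_theta * math.cos(theta) * math.sin(phi) + e_phi * math.cos(phi)
    ez = -e_theta * math.sin(theta)
    return ex, ey, ez


theta, phi = math.radians(35.0), math.radians(110.0)
ex, ey, ez = spherical_to_cartesian(0.8 + 0.3j, -0.2 + 0.5j, theta, phi)
print(ex, ey, ez)
```

Because the transformation is a rotation, the total power in the three Cartesian components equals that in the two spherical components, and the result has no component along the radial direction.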




Fig.  2  Far-field  radiation  pattern  of  the  spiral  antenna  at 
0.7  GHz. 


Fig.  3  Normalized  magnitude  of  the  electric  current 
distribution on the top surface of the planar spiral at
0.7  GHz. 


Fig. 4 Normalized magnitude of the electric current distribution on the top surface of
the planar spiral at 0.7 GHz (cut through center of antenna), plotted against placement in wavelengths.


Equation (1) was used at three frequencies to determine the optimum placement of the sources.
In each case, the optimum placement was at a radius of approximately 0.45 wavelengths. This is 40%
greater than the placement that would be used if it was assumed that the spiral radiates most of its
energy in mode 2 at a circumference of two wavelengths, i.e., a radius of 1/π wavelengths. Evidently,
the cavity around the spiral and the shape of this eight-arm spiral influence the current distribution
enough to invalidate generic assumptions about the current distribution on the surface of this spiral
antenna. When a radius of 0.45 wavelengths is not used for the source locations, the calculation of the
sources, described in the next section, produces large-magnitude sources that reproduce the specified
far-field pattern through cancellation, and only if sufficient numerical precision is used in the calculations.
Using these incorrectly placed sources in NECBSC calculations for the spiral over the ground plane
produces inaccurate results, probably indicating that their near-zone fields are less accurate
approximations to the actual near-zone fields.
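
The radius comparison is simple arithmetic, sketched here. The three frequencies are assumed to be the measurement frequencies 0.5, 0.7, and 0.9 GHz, which the text does not state explicitly for the imaging step; the 11.0" figure is the outer radius of the spiral arms.

```python
import math

C = 299_792_458.0   # speed of light, m/s
INCH = 0.0254       # metres per inch

IMAGED_RADIUS_WL = 0.45                    # radius found by imaging, in wavelengths
MODE2_RADIUS_WL = 2.0 / (2.0 * math.pi)    # circumference of two wavelengths

ratio = IMAGED_RADIUS_WL / MODE2_RADIUS_WL


def radius_inches(freq_hz, radius_wl=IMAGED_RADIUS_WL):
    """Physical radius in inches for a radius given in wavelengths."""
    return radius_wl * C / freq_hz / INCH


print(f"imaged / mode-2 radius = {ratio:.3f}")
for f_ghz in (0.5, 0.7, 0.9):   # assumed frequencies
    print(f"{f_ghz} GHz: source ring radius = {radius_inches(f_ghz * 1e9):.2f} in")
```

The ratio comes out near 1.41, consistent with the "40% greater" figure, and at each assumed frequency the source ring lies inside the 11.0" outer radius of the spiral.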


IV. Calculation of the Sources

Once the appropriate locations for the sources are determined, the magnitude and phase of the
sources are obtained using a 3-D extension to the technique developed by Mautz and Harrington [3] and
Pelton, Marhefka, and Burnside [4] for 2-D patterns. The magnitude and phase of the sources are given
by the solution to the following matrix equation

$$\begin{bmatrix} E_\theta \\ E_\phi \end{bmatrix} = \begin{bmatrix} T_{11} & T_{12} & T_{13} & T_{14} & T_{15} & T_{16} \\ T_{21} & T_{22} & T_{23} & T_{24} & T_{25} & T_{26} \end{bmatrix} \begin{bmatrix} J_x \\ J_y \\ J_z \\ M_x \\ M_y \\ M_z \end{bmatrix} \tag{3}$$

where the T submatrices are the far-field radiation patterns of the respective sources and are given by

$$T_{11(n,m)} = -T_{24(n,m)} = \cos\theta_m \cos\phi_m \, e^{jk[(x_n \cos\phi_m + y_n \sin\phi_m)\sin\theta_m + z_n \cos\theta_m]} \tag{4a}$$

$$T_{12(n,m)} = -T_{25(n,m)} = \sin\phi_m \cos\theta_m \, e^{jk[(x_n \cos\phi_m + y_n \sin\phi_m)\sin\theta_m + z_n \cos\theta_m]} \tag{4b}$$

$$T_{22(n,m)} = T_{15(n,m)} = \cos\phi_m \, e^{jk[(x_n \cos\phi_m + y_n \sin\phi_m)\sin\theta_m + z_n \cos\theta_m]} \tag{4c}$$

$$T_{21(n,m)} = T_{14(n,m)} = -\sin\phi_m \, e^{jk[(x_n \cos\phi_m + y_n \sin\phi_m)\sin\theta_m + z_n \cos\theta_m]} \tag{4d}$$

$$T_{13(n,m)} = -T_{26(n,m)} = -\sin\theta_m \, e^{jk[(x_n \cos\phi_m + y_n \sin\phi_m)\sin\theta_m + z_n \cos\theta_m]} \tag{4e}$$

$$T_{23(n,m)} = T_{16(n,m)} = 0 \tag{4f}$$

and the vector (J_x, J_y, J_z, M_x, M_y, M_z) contains the unknown complex coefficients of the N electric
and magnetic point sources located at (x_n, y_n, z_n) for n = 1...N. The vector (E_θ, E_φ) contains the
complex coefficients of the M known far-field points located at (θ_m, φ_m) for m = 1...M. The
elements of the T submatrices can be modified to take into account sources other than point sources,
including one-sided sources such as those available in NECBSC. This is inherently a least squares
problem since we are approximating a function by a sum of other functions, i.e., the far-field pattern by
the sum of the far-field patterns of the point sources. As a consequence, the number of point sources,
N, must be significantly less than the number of far-field points, M, to avoid the problems typically
encountered with an over specified least squares problem. One typical problem is that the resulting
far-field pattern could vary greatly between the specified far-field points rather than exhibiting a
smooth transition from one known far-field point to another. The particular problem we observed
when using either excessive or incorrectly placed sources is that the resulting source magnitudes
exceed reasonable magnitudes and only achieve the least squares fit to the far-field points through
severe cancellation, which requires that all calculations be done using excessive numerical precision.
Every case with excessive source magnitudes gave very poor results when incorporated into a
NECBSC model. In addition, the number of point sources, N, should be minimized since the
computational complexity increases as order MN².
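
A minimal sketch of the least squares step follows, in plain Python so that it is self-contained. Several things here are assumptions rather than the authors' implementation: the sign pattern of the T elements (E_θ = J_θ + M_φ and E_φ = J_φ - M_θ with common constants dropped), the per-location ordering of the unknowns (J_x, J_y, J_z, M_x, M_y, M_z), the use of the normal equations, and the two-location test geometry with synthetic data.

```python
import cmath
import math


def t_rows(theta, phi, sources, k):
    """E_theta and E_phi rows of T for one far-field direction (assumed signs)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    row_theta, row_phi = [], []
    for x, y, z in sources:
        ph = cmath.exp(1j * k * ((x * cp + y * sp) * st + z * ct))
        # Columns per location: Jx, Jy, Jz, Mx, My, Mz
        row_theta += [ct * cp * ph, ct * sp * ph, -st * ph, -sp * ph, cp * ph, 0j]
        row_phi += [-sp * ph, cp * ph, 0j, -ct * cp * ph, -ct * sp * ph, st * ph]
    return row_theta, row_phi


def build_system(directions, sources, k):
    """Stack the two rows of every far-field direction into one matrix."""
    T = []
    for theta, phi in directions:
        T.extend(t_rows(theta, phi, sources, k))
    return T


def solve_least_squares(T, e):
    """Solve min ||T a - e|| via the normal equations (T^H T) a = T^H e."""
    n = len(T[0])
    A = []
    for i in range(n):
        row = [sum(t[i].conjugate() * t[j] for t in T) for j in range(n)]
        row.append(sum(t[i].conjugate() * ev for t, ev in zip(T, e)))
        A.append(row)
    for col in range(n):  # Gaussian elimination with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    a = [0j] * n
    for i in reversed(range(n)):  # back substitution
        a[i] = (A[i][n] - sum(A[i][j] * a[j] for j in range(i + 1, n))) / A[i][i]
    return a


k = 2.0 * math.pi                                   # positions in wavelengths
sources = [(0.45, 0.0, 0.0), (-0.45, 0.0, 0.0)]     # hypothetical two-location layout
directions = [(math.radians(t), math.radians(p))
              for t in range(5, 90, 10) for p in range(0, 360, 30)]
a_true = [complex(1 + 0.1 * i, -0.5 + 0.2 * i) for i in range(12)]
T = build_system(directions, sources, k)
e = [sum(tij * aj for tij, aj in zip(row, a_true)) for row in T]   # synthetic far field
a_fit = solve_least_squares(T, e)
max_err = max(abs(u - v) for u, v in zip(a_fit, a_true))
print(f"largest coefficient error: {max_err:.2e}")
```

Forming T^H T costs on the order of MN² operations, which is where the scaling quoted above comes from. The normal equations square the condition number of T, so a QR or SVD based solver would be the more robust choice when the sources are poorly placed and heavy cancellation sets in.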


The antenna could be represented by electric currents alone; however, by Mayes' theorem,
electric and magnetic sources generate the same fields outside the region containing the sources if

$$\mathbf{J} = \frac{\nabla \times \mathbf{M}}{j\omega\mu} \tag{5a}$$

$$\mathbf{M} = -\frac{\nabla \times \mathbf{J}}{j\omega\varepsilon} \tag{5b}$$

i.e., for a given electric source there is an equivalent magnetic source and vice versa. Since Mayes'
theorem involves the derivative of the spatial distribution of the currents, one of the representations
for a given source will be more spatially concentrated than the other. For this reason, both electric and
magnetic sources are used in this paper.
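
The first equivalence can be checked in a few lines. The sketch below assumes an e^{jωt} time dependence and the usual convention in which a magnetic current enters Faraday's law with a minus sign; neither convention is stated explicitly in the paper.

```latex
% Magnetic source only (assumed conventions: e^{j\omega t}, M in Faraday's law).
\begin{gather*}
\nabla \times \mathbf{E} = -j\omega\mu\,\mathbf{H} - \mathbf{M}, \qquad
\nabla \times \mathbf{H} = j\omega\varepsilon\,\mathbf{E} \\[4pt]
% Define a modified magnetic field that equals H wherever M = 0.
\mathbf{H}' \equiv \mathbf{H} + \frac{\mathbf{M}}{j\omega\mu} \\[4pt]
% Substituting H = H' - M/(j\omega\mu) into both curl equations:
\nabla \times \mathbf{E} = -j\omega\mu\,\mathbf{H}', \qquad
\nabla \times \mathbf{H}' = j\omega\varepsilon\,\mathbf{E}
  + \frac{\nabla \times \mathbf{M}}{j\omega\mu}
\end{gather*}
```

The pair (E, H') therefore satisfies Maxwell's equations with an electric source equal to ∇×M/(jωμ), and H' coincides with H outside the source region. The dual argument, starting from an electric source only and shifting E by J/(jωε), gives the second relation with its minus sign.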

In the initial modeling, eight source locations were used, spaced 45° apart at a radius of 0.45
wavelengths. Each location had three electric and three magnetic x-, y-, and z-directed sources. This
was found to give good results but had 1-2 dB too much ripple in the azimuth patterns near the
horizon. To produce a more uniform ring of currents, sixteen source locations with a spacing of 22.5°
were tried next. This over specified the least squares problem, creating the problems discussed
previously. As an alternative, the 48 sources from the first solution were linearly interpolated around
the ring into a total of 96 sources spaced 22.5° apart. This gave a stable solution and matched the
measurements well.
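
One way to read the interpolation step is sketched below: a new location is inserted midway between each adjacent pair on the ring, and each of its six complex amplitudes is the average of its two neighbors. Whether the authors interpolated the Cartesian components directly, and whether any renormalization was applied, is not stated, so both choices here are assumptions; the amplitude values are placeholders.

```python
import cmath
import math


def interpolate_ring(amplitudes):
    """Insert one linearly interpolated location between each adjacent pair on a closed ring.

    amplitudes: list of per-location tuples (Jx, Jy, Jz, Mx, My, Mz), in ring order.
    """
    n = len(amplitudes)
    out = []
    for i in range(n):
        a, b = amplitudes[i], amplitudes[(i + 1) % n]   # wrap around the ring
        out.append(a)
        out.append(tuple(0.5 * (p + q) for p, q in zip(a, b)))
    return out


# Placeholder amplitudes for 8 locations spaced 45 degrees apart.
coarse = [tuple(cmath.exp(1j * math.radians(45.0 * n)) * (c + 1) for c in range(6))
          for n in range(8)]
fine = interpolate_ring(coarse)
angles_deg = [22.5 * i for i in range(len(fine))]
n_sources = sum(len(loc) for loc in fine)
print(len(fine), "locations,", n_sources, "sources")
```

Because the new amplitudes are derived from the stable 48-source solution rather than solved for, the number of unknowns in the least squares problem stays at 48.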
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V.  Antenna  Alone:  Distributed  Source  Technique  Versus  Measurements 

Figures  5  through  12  compare  the  distributed  source  calculations  and  measurements  for  the 
spiral  antenna  without  a  ground  plane.  The  results  presented  here  were  computed  using  a  far-field 
pattern consisting of elevation cuts with Δθ = 2° and Δφ = 10° at 0.5 GHz. Elevation cuts at 0° and
60° and azimuth cuts at 60° and 80° from boresight are shown for both horizontal and vertical
polarizations. No scaling factor was used in any of these calculations. In general, both polarizations
have  good  matches  to  the  measurements;  however,  the  calculated  horizontal  polarization  does  not 
have  as  deep  a  dip  at  130°  in  the  elevation  patterns  as  the  measurements. 


Fig.  5  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  no  plates, 
horizontal  polarization,  elevation  cut  at  0  degrees. 



Fig.  6  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  no  plates,  vertical 
polarization,  elevation  cut  at  0  degrees. 


Fig.  7  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  no  plates, 
horizontal  polarization,  elevation  cut  at  60  degrees. 




Fig.  8  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  no  plates,  vertical 
polarization, elevation cut at 60 degrees.

Fig. 9 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.5 GHz, no plates,
horizontal polarization, azimuth cut at 80 degrees.
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Fig.  10  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  no  plates,  vertical 
polarization, azimuth cut at 80 degrees.

Fig. 11 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.5 GHz, no plates,
horizontal polarization, azimuth cut at 60 degrees.




Fig. 12 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.5 GHz, no plates, vertical
polarization, azimuth cut at 60 degrees.
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VI.  Antenna  on  Edge  of  Small  Ground  Plane:  Distributed  Source  Technique 
Versus  Measurements 

The  distributed  source  calculations  and  measured  far-field  azimuth  patterns  of  the  spiral  on  the 
edge  of  a  small  ground  plane  are  compared  next.  The  geometry  is  shown  in  Figure  13.  The  ground 
plane  is  a  square  66  inches  on  a  side.  The  spiral  was  set  at  heights  of  0.3,  2.2,  and  6.3  inches  above 
the  ground  plane  and  the  radiation  pattern  was  measured  at  the  frequencies  0.5,  0.7,  and  0.9  GHz. 


The ground plane was modeled as a six-sided plate, with the three points closest to the spiral
set at a distance of 12.2" from the center of the spiral. Figures 14 through 19 show the results for 0.5
GHz. The plots are azimuth cuts taken 60 degrees from boresight. Figures 14, 16, and 18 are the
horizontal polarizations and Figures 15, 17, and 19 are the vertical polarizations. Horizontal and
vertical polarizations for 0.7 GHz and 0.9 GHz are shown in Figures 20 through 25 and Figures 26
through 31, respectively. No scaling factor was used in any of these calculations. As can be seen,
good agreement is obtained between the distributed source calculations and the measurements.




Fig.  14  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  plate  0.3"  below 
spiral,  horizontal  polarization,  azimuth  cut  at  60  degrees. 


Fig.  15  Experimental  versus  NECBSC  with  96  sources 
interpolated from 48 sources, 0.5 GHz, plate 0.3" below
spiral,  vertical  polarization,  azimuth  cut  at  60  degrees. 


Fig.  16  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  plate  2.2"  below 
spiral,  horizontal  polarization,  azimuth  cut  at  60  degrees. 


Fig.  17  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  plate  2.2"  below 
spiral,  vertical  polarization,  azimuth  cut  at  60  degrees. 




Fig.  18  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  plate  6.3"  below 
spiral,  horizontal  polarization,  azimuth  cut  at  60  degrees. 


Fig.  19  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.5  GHz,  plate  6.3"  below 
spiral, vertical polarization, azimuth cut at 60 degrees.

Fig. 20 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.7 GHz, plate 0.3" below
spiral, horizontal polarization, azimuth cut at 60 degrees.

Fig. 21 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.7 GHz, plate 0.3" below
spiral, vertical polarization, azimuth cut at 60 degrees.

Fig. 22 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.7 GHz, plate 2.2" below
spiral, horizontal polarization, azimuth cut at 60 degrees.

Fig. 23 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.7 GHz, plate 2.2" below
spiral, vertical polarization, azimuth cut at 60 degrees.

Fig. 24 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.7 GHz, plate 6.3" below
spiral, horizontal polarization, azimuth cut at 60 degrees.

Fig. 25 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.7 GHz, plate 6.3" below
spiral, vertical polarization, azimuth cut at 60 degrees.


Fig. 26 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.9 GHz, plate 0.3" below
spiral, horizontal polarization, azimuth cut at 60 degrees.


Fig.  27  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.9  GHz,  plate  0.3"  below 
spiral,  vertical  polarization,  azimuth  cut  at  60  degrees. 


Fig.  28  Experimental  versus  NECBSC  with  96  sources 
interpolated  from  48  sources,  0.9  GHz,  plate  2.2"  below 
spiral,  horizontal  polarization,  azimuth  cut  at  60  degrees. 


Fig. 29 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.9 GHz, plate 2.2" below
spiral, vertical polarization, azimuth cut at 60 degrees.


Fig. 30 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.9 GHz, plate 6.3" below
spiral, horizontal polarization, azimuth cut at 60 degrees.


Fig. 31 Experimental versus NECBSC with 96 sources
interpolated from 48 sources, 0.9 GHz, plate 6.3" below
spiral, vertical polarization, azimuth cut at 60 degrees.


The match between calculations and measurements is better at the higher frequencies, as
would  be  expected  with  any  UTD  code.  It  should  be  noted  that  no  scaling  factor  was  used  in  the 
calculations. 


VII.  Antenna  on  Edge  of  Small  Ground  Plane:  Interpolated  Source  Technique 
Versus  Measurements 

Figures  32  to  37  show  the  results  of  NECBSC  calculations  using  the  interpolated  source 
technique  with  the  spiral  on  a  ground  plane  at  varying  heights.  With  this  technique,  the  measured  far- 
field  pattern  of  the  antenna  is  input  into  the  NECBSC  code  and  is  used  as  the  pattern  function  for  a 
point  source.  The  ground  plane  plate  was  input  as  a  square  plate  with  one  corner  near  the  source. 
This  was  found  to  give  a  better  match  to  the  data  than  using  a  six-  or  eight-sided  plate  with  the  edge 
rounded near the point source. A constant scale factor was used. As can be seen, sometimes a very
good  match  to  the  data  is  achieved  and  sometimes  it  appears  to  be  offset  by  a  few  dB.  The  results 
deteriorated  as  the  spacing  between  the  spiral  antenna  and  the  plate  increased.  The  worst  results 
occurred  with  vertical  polarization  at  the  maximum  spacing  of  6.3  inches,  which  is  shown  in  Figure 
37.  These  calculations  were  all  done  at  0.7  GHz  and  should  be  compared  to  the  distributed  source 
calculations  shown  in  Figures  20  through  25.  The  results  at  0.5  and  0.9  GHz  were  similar,  with  some 
good  matches  and  some  poor  matches. 


Fig. 32 Experimental versus NECBSC with interpolated
source, 0.7 GHz, plate 0.3 inches below the spiral, horizontal
polarization, azimuth cut at 60 degrees.


Fig. 33 Experimental versus NECBSC with interpolated
source, 0.7 GHz, plate 0.3 inches below the spiral, vertical
polarization, azimuth cut at 60 degrees.


Fig. 34 Experimental versus NECBSC with interpolated
source, 0.7 GHz, plate 2.2 inches below the spiral,
horizontal polarization, azimuth cut at 60 degrees.


Fig. 35 Experimental versus NECBSC with interpolated
source, 0.7 GHz, plate 2.2 inches below the spiral, vertical
polarization, azimuth cut at 60 degrees.




Fig.  36  Experimental  versus  NECBSC  with  interpolated 
source,  0.7  GHz,  plate  6.3  inches  below  the  spiral, 
horizontal  polarization,  azimuth  cut  at  60  degrees. 


Fig. 37 Experimental versus NECBSC with interpolated
source, 0.7 GHz, plate 6.3 inches below the spiral, vertical
polarization, azimuth cut at 60 degrees.


VIII.  Conclusions


The  distributed  source  technique  did  a  reasonably  good  job  modeling  the  planar  spiral  antenna 
on  a  small  ground  plane.  Results  were  better  at  the  higher  frequencies.  It  is  a  significant 
improvement  over  the  interpolated  source  technique  which  uses  a  single  point  source.  This  technique 
has  proven  useful  for  the  problem  of  modeling  antennas  on  large  complex  structures.  It  can  easily  be 
applied  to  other  antenna  types. 


Abstract. The Generalized Multipole Technique has been adapted for the analysis of wafer
contamination inspection systems. This technique has made it possible to investigate such
complete mathematical models as particles deposited on a smooth substrate surface and
pits. Some computer simulation results are discussed.


Advanced inspection technologies should have reliable detectors of micro-contaminations
down to 0.1 microns. To establish the morphology of micro-contamination it is essential not
only to detect particles but also to distinguish a particle from a pit and to reconstruct
the particle size and material. During the last decade much effort has been directed at
improving wafer surface scanners, which generally make use of laser light scattering
(wavelength either 633 nm or 488 nm) to detect contaminants.


Solution of such problems is impossible without analysis of the complete mathematical model
and computer simulation. The other reason is the need to improve the resolution ability of
existing technology equipment so as to provide a high signal-to-noise ratio. To increase the
resolution ability of existing equipment, or to extract the contaminant's material, it is
necessary to have efficient mathematical and computer tools for simulation.


Therefore calculation of light scattering from particles deposited on surfaces or from pits is of
great interest in the simulation, development and calibration of contamination-inspection
scanners. A number of studies have addressed this problem with the use of widely varying
methods. The most general methods, such as the finite-difference or boundary integral
equation techniques, are very time-consuming and demand the use of supercomputers. The main
difficulty of these techniques is the necessity of taking into account the presence of the
unbounded substrate surface. In this report we suggest the Generalized Multipole Technique
(GMT) for the analysis of light scattering by a particle or pit on a smooth surface. GMT
was originally used to compute EM wave scattering from an axisymmetric body buried
underground [1].

The complete mathematical model for EM scattering of a laser beam consists of the Maxwell
system inside and outside the particle, the Silver-Muller radiation conditions at
infinity and conjunction conditions on the boundaries of media discontinuity. In the
frame of GMT the scattered field is represented as a superposition of multipole fields. It
satisfies the Maxwell equations, the radiation conditions and the conjunction conditions on the
smooth unbounded substrate surface analytically.

We should represent the exciting field, a P/S plane wave, in the same manner, satisfying these
conjunction conditions also. Multipole amplitudes need to be determined from the
boundary condition at the local obstacle surface only. Since the obstacles under investigation
are axisymmetric, to simplify the boundary-value scattering problem we employ a
Fourier-series expansion for both the scattered and the exciting fields, which reduces the
spatial scattering problem to a set of problems on the azimuthal half-plane.

We use a special construction for the EM fields of the multipoles, taking into account
Sommerfeld integrals also. Multipoles are located in the complex plane adjoining the axis of
symmetry.

The singular part of the fields can be represented as a finite linear combination of primary
functions. Multipole amplitudes for each harmonic are determined as a pseudo-solution
of an overdetermined linear system obtained from the point-matching approach at the
meridian of the obstacle (particle or pit). This minimizes the mean-square
norm of the Fourier harmonics of the boundary values of the EM fields at the meridian of the
obstacle and provides stability of the results [2].
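
The pseudo-solution step can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the matrix `A` stands for hypothetical multipole field values sampled at matching points on the meridian, `b` for the corresponding boundary data, and the normal equations are used merely to keep the example dependency-free (a QR- or SVD-based solver would normally be the more robust choice).

```python
def lstsq_pseudo_solution(A, b):
    """Least-squares solution of an overdetermined system A x = b.

    A is a list of M rows with N (possibly complex) entries each, M >= N;
    b is a list of M values. The result minimizes the sum of |A x - b|^2.
    """
    m, n = len(A), len(A[0])
    # Normal equations G x = h with G = A^H A and h = A^H b.
    G = [[sum(A[k][i].conjugate() * A[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    h = [sum(A[k][i].conjugate() * b[k] for k in range(m)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(G[r][col]))
        G[col], G[pivot] = G[pivot], G[col]
        h[col], h[pivot] = h[pivot], h[col]
        for r in range(col + 1, n):
            factor = G[r][col] / G[col][col]
            for c in range(col, n):
                G[r][c] -= factor * G[col][c]
            h[r] -= factor * h[col]
    # Back substitution.
    x = [0j] * n
    for i in reversed(range(n)):
        tail = sum(G[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (h[i] - tail) / G[i][i]
    return x


# Hypothetical example: three matching points, two unknown amplitudes.
amplitudes = lstsq_pseudo_solution([[1, 0], [0, 1], [1, 1]], [1, 2, 3])
```

Using more matching points than unknown amplitudes is what makes the system overdetermined; the residual of the fit then gives a natural measure of how well the boundary condition is met.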

The efficiency of the GMT allows the code to be run on PCs. One is able to
investigate an arbitrarily shaped obstacle having an arbitrary refractive index and a diameter up to
3 incident wavelengths. Furthermore, the calculation handles any angle of incidence and
both P and S polarizations simultaneously. A means of estimating the
error of the result obtained has been implemented as well.

We carried out an analysis of light scattering by particles of different sizes and materials
deposited on the smooth surface of bare silicon. We used the Tencor Surfscan 4000 and
5500 as the models of the inspecting system [3].

Fig. 1

We investigated many kinds of particles, both metallic and dielectric. We compared our
results with the calibration data for polystyrene particles and with the experimental results,
especially for small particles.


Fig. 2


We also investigated light scattering by particles and pits located on a layer
deposited on the smooth surface. Our results seem to be useful for establishing
differences between the scattering patterns of a particle and a pit. In Figs. 1-3 the experimental
data obtained from the Tencor Surfscan 5500 system are shown in comparison with the
numerical ones. The vertical axis denotes the scattering cross section (µm²) and the horizontal
one the particle diameter (µm). Fig. 1 shows the case of a polystyrene particle; Figs. 2 and 3
correspond to silicon and SiO2 particles.

Conclusion. GMT [2] appears to be an effective tool for improving the resolution ability of
contamination inspection systems by computer simulation. The code created allows
complete mathematical models to be investigated using only a PC. The results obtained make it
possible to distinguish a particle from a pit.


Boundary integral methods provide a promising approach to the solution of challenging industrial
CEM problems. In particular, the electric field integral equations are often used since they can
be applied both to closed and open perfectly conducting scatterers with and without (multilayered)
dielectric coverings. Unfortunately, this approach, mostly based on the Rao-Wilton-Glisson
method (in its current state of the art), possesses two principal drawbacks:

(A) it is necessary to solve very large dense complex systems of simultaneous equations, which
require enormous memory and arithmetic costs (even taking into account recent advances in iterative
algorithms for solving dense linear systems);

(B) the Rao-Wilton-Glisson method converges very slowly, especially when solving CEM
problems at GHz frequencies for very large electrical sizes; it is commonly accepted that the size
of the linear systems encountered is strictly linearly dependent on the frequency.

Besides these two well-recognized drawbacks we would like to mention another serious
disadvantage of the Rao-Wilton-Glisson method:

(C) it is very hard to believe that, in the framework of the current state of the art of the
Rao-Wilton-Glisson method, it is possible to prove a constructive convergence rate theorem providing
reliable and computable a posteriori error estimates for the integral equation solution.

Therefore,  we  need  an  alternative  to  the  Rao-Wilton-Glisson  method  which  is  free  of  drawbacks 
(A)  -  (C).  Our  results  are  related  to  construction  of  such  an  alternative. 

We  begin  with  some  model  2D  CEM  problems  including  E-  and  H-  polarization  problems  for 
perfect  conducting  bodies  with  and  without  dielectric  coverings.  Following  the  Galerkin  approach 
in order to solve an operator equation $Au = f$ we use the linear system in the form

$$\sum_{j=1}^{N} (A\varphi_j, \varphi_i)\, x_j = (f, \varphi_i), \qquad i = 1, \dots, N,$$

with a suitable choice of the test functions $\varphi_i$ and the scalar product. We propose to construct
the hierarchical test functions from a family $\Psi = \{\psi_i\}$ of basic functions defined on a reference
support. In this case the functions $\varphi_i$ can be defined from $\psi_i$ via a natural change of variables
in order to translate the basic function to a prescribed local support. Thus, we construct a
hierarchical family of supports $\Omega = \{\omega_i\}$. Any test function is supplied with two indices $l$
(referring to the set of basic functions) and $m$ (referring to the set of supports). In general $\Psi$ and
$\Omega$ can be obtained by hierarchical rules.

When using the segment $[0,2\pi]$ as a reference support we define the family of basic functions
as follows.

Level 0: the cut-off function $f_0(t)$, defined separately for $t \in [0,2\pi]$ and for $t \notin [0,2\pi]$;

Level 1: $\cos t \cdot f_0(t)$, $\sin t \cdot f_0(t)$;

Level $l$: $\cos lt \cdot f_0(t)$, $\sin lt \cdot f_0(t)$.

The approximate solution $\tilde u$ to $u$ on the reference support is then sought in the following form

$$\tilde u(t) = \sum_{i=1}^{n} c_i\, \psi_i(t),$$

which can be considered as a truncated Fourier series multiplied by the cut-off function. The
equation

$$\varphi_{l,m}(s) = \psi_l\!\left(2\pi\,\frac{s-a}{b-a}\right)$$

specifies the translation of the basic functions to an arbitrary support $[a, b]$.
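
The construction of the basic functions can be sketched as follows. This is an illustrative reading, not the authors' implementation: the cut-off `f0` is assumed here to be the indicator of the reference support, and the translation to `[a, b]` is assumed to be the affine change of variables; the function names are hypothetical.

```python
import math


def cutoff(t):
    """Assumed cut-off f0: 1 on [0, 2*pi] and 0 elsewhere (illustrative choice)."""
    return 1.0 if 0.0 <= t <= 2.0 * math.pi else 0.0


def basic_function(level, kind, t):
    """Basic function on the reference support [0, 2*pi].

    Level 0 returns f0(t); level l >= 1 returns cos(l t) f0(t) for
    kind == "cos" and sin(l t) f0(t) otherwise.
    """
    if level == 0:
        return cutoff(t)
    trig = math.cos if kind == "cos" else math.sin
    return trig(level * t) * cutoff(t)


def translated(level, kind, s, a, b):
    """Basic function moved to the support [a, b] by an affine map onto [0, 2*pi]."""
    t = 2.0 * math.pi * (s - a) / (b - a)
    return basic_function(level, kind, t)


# Level-1 cosine on the support [1, 3], evaluated at the midpoint s = 2.
midpoint_value = translated(1, "cos", 2.0, 1.0, 3.0)
```

Raising the level on a fixed support adds higher harmonics without changing the mesh of supports, which is the sense in which the family is hierarchical.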

To give a flavor of the kind of numerical results obtained, we present numerical
results for the E-polarized wave scattering problem for a perfectly conducting cylinder. In this
case the electric field integral equation takes the form

$$i\omega\mu \int_S \frac{\ldots}{4\pi r}\, J \, ds = -E^{i}$$




describing  the  fact  that  the  tangential  component  of  the  total  electric  field  must  vanish  on  the 
scatterer  S. 

Table 1 presents the sizes of the linear systems produced by the hierarchical approximation method
and the RWG method in order to guarantee a prescribed accuracy (AC) of the computed solution
with respect to the analytical solution for sampled values of the electrical size (ES). By
definition, ES = ka, where k is the wave number and a is the cylinder radius. The error of
the computed current is measured in the C-norm. We are usually interested in the relative error

$$\frac{\|x - x_*\|_C}{\|x_*\|_C} \cdot 100\%,$$

where $x$ is the computed solution while $x_*$ is the analytical solution. It
should be emphasized that for both methods we tried to get the smallest linear systems which provide
the desired accuracy (in the RWG method we vary only the number of supports per wavelength,
while in the hierarchical method we vary both the number of supports per wavelength and the
number of levels taken up for each support).
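
As a small illustration of the error measure, the sketch below evaluates the relative error in the C-norm for sampled current values. It assumes the C-norm is taken as the maximum modulus over the sample points and that the error is normalized by the analytical solution; the function name and sample data are hypothetical.

```python
def relative_error_percent(computed, exact):
    """Relative error in the C-norm (maximum modulus over samples), in percent."""
    deviation = max(abs(x - y) for x, y in zip(computed, exact))
    scale = max(abs(y) for y in exact)
    return 100.0 * deviation / scale


# Hypothetical samples of a computed and an analytical current.
error = relative_error_percent([1.0, 2.02], [1.0, 2.0])
```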

Table 1: Comparison of Sizes of Linear Systems for the Rao-Wilton-Glisson Method and the
Hierarchical Approximation Method when Solving the E-polarization Problem for the Perfect
Conductor


The Rao-Wilton-Glisson method

AC \ ES      50              100             200
≈ 5%         600 (5%)        2000 (5.2%)     2500 (6%)
≈ 1%         1000 (1.2%)     5000 (1.3%)     10000 (2.8%)
≈ 0.1%       10000 (0.3%)    > 40000         > 60000

The hierarchical method

AC \ ES      50              100             200
≈ 5%         120             300             560
≈ 1%         150             500             630
≈ 0.1%       208             720             810

The  results  presented  show  that  the  hierarchical  approximation  method  allows  us  to  reduce 
substantially  the  size  of  linear  systems  (in  contrast  to  the  RWG  method)  especially  for  large 
electric  sizes  (high  frequencies)  and  high  accuracies.  In  the  hierarchical  approximation  method  we 
need  to  use  about  3-5  unknowns  per  wavelength  to  obtain  an  accuracy  AC  less  than  1%  while  the 
RWG  method  requires  about  20-40  supports  (unknowns)  per  wavelength  to  get  the  same  accuracy. 
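
The figure of 3-5 unknowns per wavelength can be checked against Table 1 under one assumption: the discretized contour is the full circle of circumference 2πa, whose length in wavelengths is 2πa/λ = ka = ES. A minimal sketch (the function name is hypothetical):

```python
def unknowns_per_wavelength(system_size, electrical_size):
    """Unknowns per wavelength of contour for a circle with ES = k*a.

    The circumference 2*pi*a measured in wavelengths equals k*a, so the
    density of unknowns is simply system_size / ES.
    """
    return system_size / electrical_size


# Hierarchical method, row "AC about 1%" of Table 1: ES -> system size.
hierarchical_1pct = {50: 150, 100: 500, 200: 630}
densities = {es: unknowns_per_wavelength(n, es) for es, n in hierarchical_1pct.items()}
```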

The hierarchical approximation method has some other important advantages. Our results
show that the accuracy of the RWG method with a constant number of supports per wavelength
can vary dramatically depending upon the electrical size. In contrast, the hierarchical
approximation method is practically insensitive to the resonant frequencies. Figure 1 shows the
relative errors corresponding to the first and to the second levels of the hierarchical test functions.
The relative error for the first level is quite similar to that of the RWG method; the latter is
shown in Fig. 2. However, the relative error for the second level is almost constant and is almost
independent of the electrical size. The detailed description of the numerical comparison of the
hierarchical method with the RWG method can be found in [2].

The primary observation that follows from our numerical experiments is the justification
of the cost-effectiveness and robustness of the hierarchical approach. Even for smooth
bodies and moderate accuracy, our method significantly outperforms the RWG method, not to
mention cases when a high accuracy is needed. The hierarchical approach generally has an
exponential convergence rate with respect to levels. Moreover, this method seems to exhibit
exponential convergence with respect to the number of unknowns, i.e. the linear system size. We
can thus conclude that the hierarchical approach enables us to reach very high accuracy on small
linear systems. We have also shown that the hierarchical approach can be successfully applied to
stretched out and nonsmooth scatterers.

One should appreciate that the 3D vector boundary integral equations are much harder to
analyze than the scalar equations of the 2D case. All the theory necessary for work with the 2D
case has been well known for years, whereas a general theory of the 3D electric field integral equations
has been developed just recently.

For the 2D case we are mainly interested in proving the quasiexponential convergence rate of
our hierarchical schemes. Let a sequence $u_n$ converge to $u$. Then the convergence rate is called
quasiexponential if for any $s > 0$ there exists $c_s > 0$ such that

$$\|u_n - u\| \le \frac{c_s}{n^s}.$$

We show that the quasiexponential convergence rate is guaranteed whenever the solution possesses
some regularity properties.
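
To make the definition concrete, the sketch below uses a hypothetical model error sequence e_n = exp(-sqrt(n)), which is not taken from the paper. For every fixed s the product n^s e_n stays bounded, so a constant c_s exists for each s, which is exactly what the definition asks for; a purely algebraic rate such as 1/n^2 would fail for s > 2.

```python
import math


def model_error(n):
    """Hypothetical error sequence e_n = exp(-sqrt(n)), for illustration only."""
    return math.exp(-math.sqrt(n))


def bound_constant(s, n_max=10000):
    """Smallest c_s such that e_n <= c_s / n**s for all 1 <= n <= n_max."""
    return max(n ** s * model_error(n) for n in range(1, n_max + 1))


# For this model the maximum of n**s * e_n is attained at n = (2 s)**2.
c_1 = bound_constant(1)
c_3 = bound_constant(3)
```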

Consider a closed continuous piecewise smooth curve $L$ with corner points $a_i$, and introduce
a class of functions

rn 

t=i 

where  a,  >  a  are  prescribed  exponents. 

Definition. A set of functions $\{\zeta_i\}$ associated with a covering $U = \{U_i\}$ of $L$ is called
a pseudopartition of unity if it satisfies the following requirements:

(a) $\zeta_i$ is infinitely smooth on the closed interval which is mapped onto the closure of $U_i$;

(b) $\zeta_i$ is nonzero in $U_i$;

(c) $\zeta_i$ is allowed to take the zero value of a finite multiplicity at the end points of $U_i$;

(d) $\zeta_i$ is equal to zero outside the closure of $U_i$.

Define the approximation space $T_n$ as the set of all functions $v$ of the form

$$v = \sum_i p_i\, \zeta_i,$$

where $p_i$ is a trigonometric polynomial of degree $n$.

Theorem.  Let  u  e  and  C  G  for  all  i.  If  o  <  A:  -f  ^  then  for  any  s  >  0  there  exists 
Cj  >  0  such  that  for  all  n  >  1 

inf 
i.e., the Galerkin method converges quasiexponentially.

Corollary.  For  any  smooth  closed  and  for  any  smooth  open  screen  the  Galerkin  method  with 
the  approximation  spaces  converges  quasiexponentially. 

The Galerkin schemes for 3D problems can be introduced quite similarly to the formulations of
the 2D case. The one principal difficulty is that the operator we now deal with is no longer
strongly elliptic. As a consequence, for 3D electromagnetic diffraction problems the approximation
property of test functions is still necessary but no longer sufficient to provide convergence. It
therefore comes as no surprise that some Galerkin algorithms useful for many industrial applications (in
particular, the Rao-Wilton-Glisson method) have no rigorous proof of convergence yet.

A relevant theory of the electric field integral equation has been proposed just recently in [1].
It  makes  it  possible  to  propose  some  projection  methods  and  present  a  proof  that  these  methods 
are  guaranteed  to  converge. 

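The Diagonal Galerkin Method. Let the test functions $v_i^1$ and $v_i^2$ be chosen such that

$$\operatorname{div} v_i^1 = 0, \qquad v_i^2 = \operatorname{grad} h_i$$

for some scalar function $h_i$.
Let

$$u^1 \in \operatorname{span}\{v_i^1\}, \qquad u^2 \in \operatorname{span}\{v_i^2\}.$$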


Then  the  diagonal  Galerkin  method  is  defined  by  the  following  equations 


\  =  ifyvf),  z  = 

Theorem.  If  the  electric  field  integral  equation  has  a  unique  solution  then  the  diagonal 
Galerkin  method  is  guaranteed  to  converge. 

Corollary.  For  a  smooth  closed  screen  the  convergence  rate  for  the  hierarchical  p-version  is 
quasiexponential. 

In effect, the theorem states that there exists a Galerkin method which is convergent in the
cases of closed and open screens. Such a Galerkin method requires the choice of specific test
functions with the properties described above. Moreover, this theoretical study provides a principal
key to an analysis of convergence properties for other sets of test functions (and ultimately to
the choice of "optimal" test functions) which appear to be similar in a sense to the functions
described above.

To the best of our knowledge, a rigorous mathematical proof of convergence of a Galerkin
method for the 3D EFIE has not been published so far. The detailed description of such a proof can
be found in [3].

It should be noted that, in spite of the lack of rigorous proofs, there are some well-designed
Galerkin schemes which usually behave as convergent in industrial applications. We propose a
method which allows one to explain why these methods converge in practice. We propose in fact
a new pseudoprojection method which we call the r-projection method. Under rather general
hypotheses we prove that the r-projection method is guaranteed to converge for a continuously
invertible operator $A$ provided that $0 < r < 1/\|A^{-1}\|$. We believe this result opens up a way to
produce some convergence rate estimates for the Galerkin method applied to 3D vector equations.

We now briefly present some numerical results for the 3D case. Consider the diffraction
problem on a rectangular metal plate $[-\pi,\pi] \times [-\pi,\pi]$ in the plane of the coordinates
$x$ and $y$. Let the incident field be $E = E_x e_x + E_y e_y$, where $|E| = 1$ and $\operatorname{arctg}(E_y/E_x) = 20°$. The wave
number is equal to 5. For each vector component of the unknown current we use 4 supports along
the vector direction and 20 supports in the transverse direction, the trigonometric polynomials
being employed only for one direction. In total we have 160 supports. What is interesting to look
at is the following convergence history of the hierarchical method:

The Convergence History of the Hierarchical Method

Level          0      1      2      3
System Size    160    480    800    1120
Error (%)      -      118    12     8

The RWG-like method (which exploits quadrilateral supports) for the same problem demonstrated
the following convergence history:

The Convergence History of the RWG-like Method

System Size    420    760    924    1104    1300    1512    1740
Error (%)      -      16     21     29      49      28      15

We thus see that, in contrast to the RWG method, the hierarchical method exhibits a monotone
decrease of the error estimate and allows one to deal with smaller linear systems.
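
The monotonicity claim can be checked directly from the two convergence histories. The sketch below uses only the error values reported in the tables (the first entry of each history has no reported error and is omitted); the helper name is hypothetical.

```python
def is_monotone_decreasing(errors):
    """True if every error is strictly smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(errors, errors[1:]))


# Reported errors in percent, in order of increasing system size.
hierarchical_errors = [118, 12, 8]           # system sizes 480, 800, 1120
rwg_like_errors = [16, 21, 29, 49, 28, 15]   # system sizes 760 ... 1740

hierarchical_ok = is_monotone_decreasing(hierarchical_errors)
rwg_like_ok = is_monotone_decreasing(rwg_like_errors)
```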


Figure 1: Accuracy of the Hierarchical Approximation Method with the Fixed Number of Supports
per Wavelength versus the Electrical Size when Solving the E-polarization Problem for the
Perfectly Conducting Circular Cylinder.


Figure 2: Accuracy of the RWG Method with the Fixed Number of Supports per Wavelength
versus the Electrical Size when Solving the E-polarization Problem for the Perfectly Conducting
Circular Cylinder.


Abstract

In this paper we discuss research trends on the finite-difference time-domain
(FDTD) algorithm in optics and present the types of devices whose design and
analysis can benefit from the use of this method. The FDTD method is a versatile
and powerful method for the analysis of electromagnetic wave interactions. It is
well suited for the analysis of compact geometries having strong wave interactions
or having weak but extended interactions that can add up coherently. Although
the method offers many advantages over the existing optical guided-wave theories,
the optical size of most optical devices requires enormous computational resources,
which makes the FDTD analysis impractical. This paper discusses current efforts to
improve the efficiency of the method through modifications to the full
vector formulation and to extend the FDTD algorithm to more complex
media with dispersive and nonlinear propagation properties.

1.  Introduction 

As the physical size of optical devices becomes more compact and the computational power
of new workstations increases, the FDTD method appears to be an attractive
solution to the analysis of these devices. However, most current optical devices and
structures are hundreds and even thousands of wavelengths long, which makes it impractical
to analyze them by FDTD. The large size requirement for optical structures is due
mainly to the weakly guiding nature of the optical waveguides and the relatively weak
dispersion of most optical waveguides. In the design of optical devices, it is necessary
to limit the total loss to satisfy the power budget requirement for the optical system to
be practical. Excess loss in coupling between the devices and waveguide, and at bends and
interfaces within the devices, cannot be tolerated. The reduction of the total loss usually
requires the devices to be made adiabatic, with very slow transitions. For example, at the
branching section of devices such as directional couplers and power splitters, the typical
branching angle is less than 1°. Since the fields of the optical waveguides are associated
with surface waves having fields extending outside of the core, waveguides must be separated
sufficiently for them to be isolated from each other. These factors can make
the branching section of the device alone hundreds of wavelengths long.

Another reason for the large size requirement is the relatively weak dispersion
relationship found in optical structures. For example, within the wavelength band of
operation of a wavelength division multiplexer, typically a fraction of 1 µm, the relative
change of the coupling length is typically less than one percent. Increasing
the wavelength sensitivity of the devices, for instance to reduce the linewidth of a directional
coupler type optical wavelength filter, requires an increase in the optical length of the
device. These factors make the optical length of the common devices on the order
of thousands of wavelengths.
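
A rough cell-count estimate shows why such lengths strain an FDTD analysis. The sketch assumes a uniform grid with 20 cells per wavelength, a commonly quoted rule of thumb rather than a figure from this paper, and the device dimensions in the example are hypothetical.

```python
import math


def fdtd_cell_count(lengths_in_wavelengths, cells_per_wavelength=20):
    """Number of grid cells for a uniform FDTD mesh.

    lengths_in_wavelengths holds one entry per spatial dimension, each
    expressed in wavelengths in the medium.
    """
    count = 1
    for length in lengths_in_wavelengths:
        count *= math.ceil(length * cells_per_wavelength)
    return count


# Hypothetical 2D device: 1000 wavelengths long and 20 wavelengths wide.
cells_2d = fdtd_cell_count([1000, 20])
```

The count grows with the product of all dimensions, and the number of time steps needed for a wave to traverse the device grows with its length as well, so the total work rises faster than the cell count alone suggests.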

Despite the fact that the FDTD method seems computer intensive in the analysis
of optical devices, the method draws attention in optical waveguide research because it
solves certain problems that are difficult to solve by the other guided-wave theories.
In the following section we will review some of the more popular analysis methods in
guided-wave optics and discuss problems that are well suited for FDTD analysis. In
Section 3, we will present alternative FDTD algorithms for optics which demand fewer
computational resources. We will also discuss the role of the extended FDTD algorithms,
such as those for dispersive, time-varying and weakly nonlinear media, in guided-wave optics.

2.  Optical Guided-Wave Analysis and Devices for FDTD

In order to show how the FDTD method fits in with the rest of the analysis tools in guided-wave
optics, it is necessary to have an understanding of the characteristics of optical
waveguides, the fabrication limits, and a knowledge of the existing numerical analysis
used in the field. The previous section has introduced some of the characteristics of
optical waveguides; here we will start by briefly discussing some of the existing methods
that are commonly used in the analysis of guided-wave optical problems.

There exist a number of methods and approaches for the analysis of optical guided-wave
problems. Two of the most common approaches are the coupled-mode theory (CMT)
[1] and the beam-propagation method (BPM) [2]. The CMT usually considers the propagation
of only the two dominant guided modes while neglecting the coupling of the radiation
modes. The theory is very effective in most adiabatic situations where radiation is small,
and it is traditionally the method preferred by designers and experimentalists in
guided-wave optics. Another method that has gained acceptance in guided-wave optics
during the last few years is the beam-propagation method. The BPM is a numerical
method that solves the one-way wave equation in the spatial domain; the simulation is
for the propagation of the optical signal in the forward direction. Although the method
assumes that the reflection effects are weak and can be neglected, it does account for the
forward radiation fields. It is a good approximation for many optical structures and the
method has been efficiently applied to many geometries. It is important to realize that
the analysis of adiabatic structures will be more economical using these approaches than
the FDTD method.

The FDTD method can be implemented in guided-wave analysis in a straightforward
manner [3], except that the input to an optical device takes the form of a guided mode
and the medium at the truncation plane is no longer homogeneous. In the rest of this
section, we discuss recent research trends for the FDTD algorithm in guided-wave
optics. One current trend concentrates on areas where the FDTD method has a degree of
superiority over the existing methods in guided-wave optics. Some of these areas are:
1) Structures with strong wave interaction that can cause strong reflections, such as
sharp bends, corner reflectors and mirrors [4]. These problems typically span a few
tens of wavelengths; however, because the junction is very close to the boundaries,
good absorbing boundary conditions are needed to absorb both the guided modes and the
radiation fields generated. 2) Extended FDTD algorithms for complex media with
dispersive and nonlinear propagation properties [5-6]. Recent developments in
dispersive FDTD algorithms can be applied directly to the study of pulse propagation in
guided-wave optics. Applying these algorithms requires a good understanding of the
optical materials involved. Algorithms have been developed for propagation in nonlinear
materials, from soliton propagation to weakly nonlinear propagation. A report on some
of these algorithms will be presented. 3) The dynamic behaviour of optical devices.
This requires FDTD algorithms for time-varying media, for the study of optical
modulators, photodetectors and sensors [7-8]. 4) Optical micro-cavities and resonators.
A new research area in optics is the synthesis of optical materials such as photonic
band gap structures [9]. The dispersive characteristics of bulk optical materials are
altered by embedding micron-size micro-cavities in them. The small size of the cavities
makes their analysis by the FDTD method feasible.

3.  Alternate  FDTD  algorithms 

In guided-wave optical analysis, most of the propagating modes are linearly polarized.
This is because of the weak refractive index differences Δn found at the core-cladding
interfaces in the waveguides of the optical devices. For these devices it is more
efficient to develop algorithms that deal with the propagation of only the dominant
field component. The semi-vectorial FDTD algorithm [10] is one such algorithm.

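The E formulation of the semi-vectorial FDTD algorithm starts with the vector wave
equation of the electric field E in a sourceless medium,

$$\mu_0 \epsilon \frac{\partial^2 \mathbf{E}}{\partial t^2} = \nabla^2 \mathbf{E} + \nabla\left[\left(\frac{\nabla\epsilon}{\epsilon}\right)\cdot\mathbf{E}\right]. \qquad (1)$$

Let E_y be the dominant field component; from the y-component of (1) we have

$$\mu_0 \epsilon \frac{\partial^2 E_y}{\partial t^2} = \frac{\partial^2 E_y}{\partial x^2} + \frac{\partial^2 E_y}{\partial z^2} + \frac{\partial}{\partial y}\left[\frac{1}{\epsilon}\frac{\partial(\epsilon E_y)}{\partial y}\right] + \frac{\partial}{\partial y}\left[\frac{1}{\epsilon}\left(\frac{\partial\epsilon}{\partial x}E_x + \frac{\partial\epsilon}{\partial z}E_z\right)\right]. \qquad (2)$$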

The last term in (2) corresponds to the polarization coupling between the dominant E_y
field and the minor E_x and E_z fields; the significance of this coupling depends on the
magnitudes of the minor fields and on the derivatives ∂ε/∂x and ∂ε/∂z. In most optical
guided-wave analyses the polarization coupling is weak and one can apply the
semi-vectorial approximation, which neglects it. After eliminating the polarization
coupling term from (2) we have the governing equation of the semi-vectorial FDTD
algorithm for the E_y field,


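$$\mu_0 \epsilon \frac{\partial^2 E_y}{\partial t^2} = \frac{\partial^2 E_y}{\partial x^2} + \frac{\partial^2 E_y}{\partial z^2} + \frac{\partial}{\partial y}\left[\frac{1}{\epsilon}\frac{\partial(\epsilon E_y)}{\partial y}\right]. \qquad (3)$$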
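
As a short worked step (added here for clarity, assuming the vector wave equation contains the term ∇[(∇ε/ε)·E] as the y-directed source of the interface behaviour), the y-derivative terms acting on E_y combine into the single bracket that appears in the governing equation:

$$\frac{\partial^2 E_y}{\partial y^2} + \frac{\partial}{\partial y}\left[\frac{1}{\epsilon}\frac{\partial\epsilon}{\partial y}E_y\right] = \frac{\partial}{\partial y}\left[\frac{1}{\epsilon}\left(\epsilon\frac{\partial E_y}{\partial y} + \frac{\partial\epsilon}{\partial y}E_y\right)\right] = \frac{\partial}{\partial y}\left[\frac{1}{\epsilon}\frac{\partial(\epsilon E_y)}{\partial y}\right]$$

The quantity differentiated in the inner bracket is εE_y, the normal component of the displacement field, which is continuous across a dielectric interface normal to y. This is why the term carries the interface boundary condition even though the polarization coupling has been dropped.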

Although the polarization coupling is neglected in (3), this does not eliminate all of
the vectorial behaviour of the wave propagation. The last term of (3) properly models
the boundary condition at dielectric interfaces parallel to the xz-plane, that is,
normal to the polarization direction. The semi-vectorial FDTD algorithm is derived by
finite-differencing (3), and the usual source excitation and absorbing boundary
conditions of the full-vector FDTD algorithm can be applied directly. It can be shown
that the semi-vectorial FDTD algorithm is stable if

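$$v_{\max}\,\Delta t < [\text{right-hand side illegible in the source}] \qquad (4)$$

where

$$T_{j\pm1} = \frac{2\epsilon_r(i,j\pm1,k)}{\epsilon_r(i,j,k) + \epsilon_r(i,j\pm1,k)},$$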


and T_max is the maximum T_{j±1} in the region; v_max is the maximum possible phase
velocity in the region. Since the semi-vectorial FDTD algorithm deals with only a single
field component, both the computation time and the memory requirements are drastically
reduced. It is important to point out that in the two-dimensional case, with ∂/∂y = 0,
the electric and magnetic fields are decoupled and the semi-vectorial formulations for
E_y and H_y are exact. Using the semi-vectorial FDTD algorithm for two-dimensional
analyses, a 50% reduction in computation time and a 30% reduction in memory requirement
can be achieved without any loss of accuracy compared with the full-vector FDTD
algorithm.
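
In the two-dimensional case (∂/∂y = 0) the governing equation for E_y reduces to a scalar wave equation, so a single stored field component is enough. The following is a minimal sketch of such an update, not the authors' code. It assumes a uniform square grid, a second-order leapfrog in time, zero-field walls in place of absorbing boundaries, and the hypothetical names `step_ey`, `run_demo` and `courant` (the ratio c0*dt/h).

```python
import numpy as np


def step_ey(e_now, e_prev, eps_r, courant):
    """Advance Ey by one time-step with a second-order leapfrog scheme.

    courant = c0*dt/h on a square grid of spacing h (assumed normalisation).
    The field is held at zero on the outer walls for simplicity.
    """
    lap = np.zeros_like(e_now)
    # Five-point Laplacian on interior nodes (grid spacing absorbed in courant).
    lap[1:-1, 1:-1] = (e_now[2:, 1:-1] + e_now[:-2, 1:-1]
                       + e_now[1:-1, 2:] + e_now[1:-1, :-2]
                       - 4.0 * e_now[1:-1, 1:-1])
    e_next = 2.0 * e_now - e_prev + (courant ** 2 / eps_r) * lap
    e_next[0, :] = e_next[-1, :] = 0.0
    e_next[:, 0] = e_next[:, -1] = 0.0
    return e_next


def run_demo(n=41, steps=200, courant=0.5):
    """Propagate the lowest standing mode of a square box in a uniform medium."""
    x = np.linspace(0.0, 1.0, n)
    mode = np.outer(np.sin(np.pi * x), np.sin(np.pi * x))
    eps_r = np.ones((n, n))
    e_prev, e_now = mode.copy(), mode.copy()
    for _ in range(steps):
        e_now, e_prev = step_ey(e_now, e_prev, eps_r, courant), e_now
    return e_now
```

Only two time levels of one field component are stored, which illustrates where the savings over a full-vector scheme come from. The value `courant=0.5` is chosen below the two-dimensional limit of 1/sqrt(2) for this scheme in a uniform medium.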

An important development of FDTD algorithms in optics is their extension to wave
propagation in complex media, such as dispersive, nonlinear and time-varying media.
These extended algorithms are generally more rigorous than the methods used in current
optical waveguide research. If the algorithms can be made efficient, they will have an
impact on research in these areas. One area where the FDTD method has been implemented
successfully is nonlinear optics. Goorjian and Taflove modelled the propagation of
femtosecond electromagnetic solitons using a direct time integration approach [6];
their approach is very general and can be used to study soliton collisions. If the
material nonlinearity is weak, it can be incorporated into the FDTD analysis directly.
Following [11], the form of the nonlinear permittivity is


$$\epsilon_r = \epsilon_{rL} + f(\alpha|E|^2), \qquad (5)$$

where in Kerr-law media,

$$f(\eta) = \eta,$$

and in saturable two-level media,

$$f(\eta) = \frac{\eta}{1 + \eta/\Delta\epsilon_{r,sat}}.$$

The coefficient of nonlinearity α is generally many orders of magnitude less than
unity, but it scales with power. It is convenient to set α to unity, which produces
nonlinear effects at low power levels. If the features being studied involve only the
steady-state field patterns, the nonlinearity can be incorporated directly: assuming
that the magnitude of the electric field changes very slowly, the magnitude is
approximated from the field values at the two previous time-steps, and ε_r can then be
updated for the next time-step using (5). With this method the turn-on time of the
nonlinearity is 3Δt/2, which is acceptable for many applications. The approach was
found to be very effective in modelling self-guiding structures, nonlinear directional
couplers, and soliton emission and capture. Some of these results will be reported at
the meeting.
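
The direct update described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the field is taken to be a sinusoid A*cos(ωt + φ) whose amplitude A changes slowly, the phase advance per step `omega_dt` is assumed known, and the function names are hypothetical. The source does not say how the magnitude is estimated from the two previous samples, so the two-sample formula used here is one possible choice.

```python
import math


def kerr(eta):
    """Kerr-law nonlinearity: f(eta) = eta."""
    return eta


def amplitude_squared(e_now, e_prev, omega_dt):
    """Estimate |E|^2 of a slowly varying sinusoid from two successive samples.

    For E = A*cos(theta) and E_prev = A*cos(theta - omega_dt) the identity
    A^2 = (E^2 + E_prev^2 - 2*E*E_prev*cos(omega_dt)) / sin(omega_dt)^2 holds.
    """
    s = math.sin(omega_dt)
    return (e_now ** 2 + e_prev ** 2
            - 2.0 * e_now * e_prev * math.cos(omega_dt)) / (s * s)


def update_eps_r(eps_linear, alpha, e_now, e_prev, omega_dt, f=kerr):
    """Permittivity for the next step: eps_r = eps_linear + f(alpha*|E|^2)."""
    return eps_linear + f(alpha * amplitude_squared(e_now, e_prev, omega_dt))
```

A saturable medium can be modelled by passing a different function as `f`; the update loop itself does not change.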


4.  Conclusion


We have reported on current research trends for the FDTD algorithm in optics. Because
of the size of most optical devices, research efforts have concentrated on the
development of more efficient FDTD algorithms. We have presented the governing
formulation of the semi-vectorial FDTD algorithm for optics. For two-dimensional
analyses, it requires 50% less computation time and 30% less storage than the
full-vector FDTD algorithm without any loss of accuracy. We have also summarized the
areas where the FDTD method has a degree of superiority over the existing methods in
guided-wave optics.
Abstract

Waveguides in integrated optical circuits can be described as multilayered structures. The refractive index
varies in the lateral direction and differs from one layer to the next. The analysis of such structures
by the method of lines is demonstrated using an alternative unified approach. This semi-analytical finite
difference technique is both highly accurate and efficient for multilayered optical waveguides. Loss can be
taken into account by using complex refractive indices. Numerical results are given for a rib waveguide
and compared with results of other papers.


1  Introduction 

Waveguides  in  integrated  optical  circuits  consist  of  a  number  of  composite  layers.  Two  typical  structures 
are  sketched  in  Fig.  1  and  a  general  model  for  a  multilayered  structure  is  given  in  Fig.  2.  The  method 
of  lines  (MoL)  is  very  efficient  for  the  analysis  of  these  types  of  optical  waveguides.  The  MoL  takes 
advantage  of  the  layered  structure  and  uses  discretization  only  as  far  as  necessary.  Finite  differences  are 
used,  but  only  in  one  direction  for  waveguides,  whereas  an  analytical  solution  is  retained  for  the  other 
coordinate.  This  semi-analytical  approach  yields  accurate  results  with  less  computational  effort  than  other 
techniques. The MoL was introduced by Russian mathematicians for the solution of partial differential
equations and was developed in the first author's group for the analysis of multilayered waveguide
structures in the microwave and optical regions [1]-[7]. Strip-loaded optical and dielectric waveguides
[3]-[5] as well as diffused and groove guides [6] have been analysed with high accuracy. A comprehensive
description of the MoL is given in [1].


Fig.  1:  Waveguides  in  integrated  optics 

(a)  Rib  (or  strip  loaded  film)  guide  (b)  two  coupled  channel  guides 

In this paper a unified approach is presented. We use a single Hertz potential with two components
instead of two potentials with one component each as in [1] and [3]-[6]. This allows a more
concise mathematical formulation with fewer algebraic manipulations than in the conventional approach.
Thus the analysis is easier to follow and the computer algorithms are readily programmed.


Fig. 2: General analysis model for multilayered waveguide structures


In some optical structures metallizations are important, too (e.g. electro-optic modulators). The
metallization may have a finite thickness and should be treated as a dielectric with complex or
imaginary permittivity. In the ±z direction the structure is either infinite or closed by a metallic
or magnetic wall. In the x direction metallic or magnetic walls are used. For modelling open
structures, absorbing boundary conditions have been introduced [8] [9].


2  Theory 

The following steps are essential in the MoL analysis of optical waveguides:

•  partitioning  of  the  cross  section  into  suitable  layers  according  to  the  model  in  Fig.  2 

•  discretization  of  the  wave  equation  in  one  coordinate  direction 

•  transformation  to  obtain  decoupled  ordinary  differential  equations  for  the  Hertz  potential 

•  solution  of  the  equations  and  determination  of  the  tangential  electromagnetic  fields  in  each  single 
layer 

•  transfer  of  the  fields  through  the  multilayered  structure  after  transformation  back  to  spatial  domain 

•  determination  of  the  propagation  constant  and  the  field  distribution  as  the  solution  of  an  indirect 
eigenvalue  problem 


2.1  Basic equations

For the analysis we use the Hertz potential Π_e defined in [10]. In our case Π_e should have the two
components Π_x and Π_y in the two coordinate directions x and y, respectively. This is analogous to the
case in the Cartesian coordinate system described in [7] [11] [12]. The permittivity may depend on the x
coordinate too. The general case with dependence on three coordinates is described in [13]. The Hertz
potential Π_e has to fulfil the vector wave equation [10]

$$\bar\nabla^2 \mathbf{\Pi}_e - \epsilon_r^{-1}(\bar\nabla\epsilon_r)\,\bar\nabla\cdot\mathbf{\Pi}_e + \epsilon_r \mathbf{\Pi}_e = 0 \qquad (2.1)$$

$$\eta_0 \mathbf{H} = j\,\bar\nabla\times\mathbf{\Pi}_e \qquad \mathbf{E} = \epsilon_r^{-1}\,\bar\nabla\times\bar\nabla\times\mathbf{\Pi}_e \qquad (2.2)$$

The detailed formulae are given in the Appendix.

The permittivity for this problem may be a function of x only: ε_r = ε_r(x). We assume propagation in
the y direction according to exp(−j√ε_re ȳ). Therefore we have

$$\frac{\partial}{\partial\bar y} = -j\sqrt{\epsilon_{re}} \qquad (2.3)$$


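We have to solve two coupled wave equations

$$\frac{\partial^2\Pi_x}{\partial\bar z^2} + \epsilon_r\frac{\partial}{\partial\bar x}\left(\frac{1}{\epsilon_r}\frac{\partial\Pi_x}{\partial\bar x}\right) + (\epsilon_r - \epsilon_{re})\Pi_x - j\sqrt{\epsilon_{re}}\left[\epsilon_r\frac{\partial}{\partial\bar x}\left(\frac{1}{\epsilon_r}\Pi_y\right) - \frac{\partial\Pi_y}{\partial\bar x}\right] = 0$$

$$\frac{\partial^2\Pi_y}{\partial\bar z^2} + \frac{\partial^2\Pi_y}{\partial\bar x^2} + (\epsilon_r - \epsilon_{re})\Pi_y = 0 \qquad (2.4)$$

in which x̄, ȳ and z̄ are the coordinates x, y and z, respectively, normalized with the free space wave
number k_0. We obtain the field components by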

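$$E_x = \Pi_x + \frac{\partial}{\partial\bar x}\left[\frac{1}{\epsilon_r}\left(\frac{\partial\Pi_x}{\partial\bar x} - j\sqrt{\epsilon_{re}}\,\Pi_y\right)\right] \qquad \epsilon_r E_y = -j\sqrt{\epsilon_{re}}\,\frac{\partial\Pi_x}{\partial\bar x} + (\epsilon_r - \epsilon_{re})\Pi_y \qquad \epsilon_r E_z = \frac{\partial}{\partial\bar z}\left(\frac{\partial\Pi_x}{\partial\bar x} - j\sqrt{\epsilon_{re}}\,\Pi_y\right) \qquad (2.5)$$

$$\eta_0 H_x = -j\,\frac{\partial\Pi_y}{\partial\bar z} \qquad \eta_0 H_y = j\,\frac{\partial\Pi_x}{\partial\bar z} \qquad \eta_0 H_z = j\left(\frac{\partial\Pi_y}{\partial\bar x} + j\sqrt{\epsilon_{re}}\,\Pi_x\right) \qquad (2.6)$$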


These  equations  can  be  transformed  to  the  equation  for  TE^  and  1  M^,  for  £r  constant.  The  potential 
components  and  IT^  arc  connected  in  an  identical  way  with  the  field  components  and  Hy,  respec¬ 
tively. 


3  Discretization 

For inhomogeneous layers not only the potentials but also the permittivities have to be discretized. The
discretization has to be done on two different discretization line systems and yields

$$\begin{array}{ll}
\Pi_x \to \mathbf{\Pi}_x \ \text{(vector)} & \Pi_y \to \mathbf{\Pi}_y \ \text{(vector)} \\
\epsilon_r \to \epsilon_x \ \text{(diagonal matrix)} & \epsilon_r \to \epsilon_y \ \text{(diagonal matrix)} \\
\dfrac{\partial}{\partial\bar x}\Pi_x \to D_x \mathbf{\Pi}_x & \dfrac{\partial}{\partial\bar x}\Pi_y \to D_y \mathbf{\Pi}_y
\end{array} \qquad (3.1)$$

The subscripts on the difference operators D indicate for which potential the difference operator has to
be used; ε_x and ε_y are diagonal matrices. The subscripts indicate the discretization line system to
which the quantities belong. The discretized wave equations read

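$$\frac{d^2}{d\bar z^2}\mathbf{\Pi}_x + \left[-\epsilon_x D_x^t \epsilon_y^{-1} D_x - \epsilon_{re} I + \epsilon_x\right]\mathbf{\Pi}_x + j\sqrt{\epsilon_{re}}\left[D_y - \epsilon_x D_y \epsilon_y^{-1}\right]\mathbf{\Pi}_y = 0 \qquad (3.2)$$

$$\frac{d^2}{d\bar z^2}\mathbf{\Pi}_y + \left[-D_y^t D_y - \epsilon_{re} I + \epsilon_y\right]\mathbf{\Pi}_y = 0 \qquad (3.3)$$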

The two potentials Π_x and Π_y are coupled with each other for inhomogeneous layers, and the potentials
on different discretization lines are coupled as well. Using an adequate notation, the decoupling
procedure is nearly the same as for the conventional approach [1]. We would like to combine the last two
equations in matrix form. Therefore we define the matrix Q̂

$$\hat Q = \begin{bmatrix} \epsilon_x D_x^t \epsilon_y^{-1} D_x + \epsilon_{re} I - \epsilon_x & \sqrt{\epsilon_{re}}\left[D_y - \epsilon_x D_y \epsilon_y^{-1}\right] \\ 0 & D_y^t D_y + \epsilon_{re} I - \epsilon_y \end{bmatrix} = \begin{bmatrix} Q_x & Q_{xy} \\ 0 & Q_y \end{bmatrix} \qquad (3.4)$$

and the supervector

$$\hat{\mathbf{\Pi}} = \left[\mathbf{\Pi}_x^t,\ -j\mathbf{\Pi}_y^t\right]^t \qquad (3.5)$$

Fig. 3: Discretization of a rib guide


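The hat (^) on the quantities indicates supervectors and supermatrices. Now the equations (3.2) and (3.3)
can be written in a shorter form

$$\frac{d^2}{d\bar z^2}\hat{\mathbf{\Pi}} - \hat Q\,\hat{\mathbf{\Pi}} = 0 \qquad (3.6)$$

By transformation to the principal axes (diagonalization)

$$\hat T^{-1}\hat Q\,\hat T = \hat\Gamma^2 \qquad \hat{\mathbf{\Pi}} = \hat T\,\hat{\bar{\mathbf{\Pi}}} \qquad \hat\Gamma = \mathrm{Diag}(\Gamma_x, \Gamma_y) \qquad \hat T = \begin{bmatrix} T_x & T_{xy} \\ 0 & T_y \end{bmatrix} \qquad (3.7)$$

we obtain uncoupled equations

$$\frac{d^2}{d\bar z^2}\hat{\bar{\mathbf{\Pi}}} - \hat\Gamma^2\,\hat{\bar{\mathbf{\Pi}}} = 0 \qquad (3.8)$$

where Γ_x², T_x and Γ_y², T_y are the eigensolutions of Q_x and Q_y, respectively. The eigenvector
submatrix T_xy is obtained from these as the solution of a system of equations.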
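
The diagonalization step can be illustrated for the simplest case of a single homogeneous layer, where the operator for one potential reduces to a second-difference matrix plus a constant shift. The sketch below is an assumption-laden illustration and not the authors' program: it takes D^t D to be the standard matrix tridiag(−1, 2, −1), which corresponds to walls on which the potential vanishes, uses a normalized spacing `h_bar`, and relies on NumPy's symmetric eigensolver. The function names are hypothetical.

```python
import numpy as np


def layer_operator(n_lines, h_bar, eps_r, eps_re):
    """Q = D^t D / h_bar^2 + (eps_re - eps_r) I for a homogeneous layer.

    D^t D is taken as tridiag(-1, 2, -1) (assumed wall conditions).
    """
    p = (2.0 * np.eye(n_lines)
         - np.eye(n_lines, k=1)
         - np.eye(n_lines, k=-1))
    return p / h_bar ** 2 + (eps_re - eps_r) * np.eye(n_lines)


def diagonalize(q):
    """Return (gamma_sq, t) such that t^-1 q t = diag(gamma_sq).

    q is symmetric here, so t is orthogonal and t^-1 equals t transposed.
    """
    gamma_sq, t = np.linalg.eigh(q)
    return gamma_sq, t
```

For this special case the eigenvalues are known in closed form, 4 sin²(kπ / (2(N+1))) / h_bar² + (ε_re − ε_r) for k = 1..N, which gives a convenient check of an implementation before moving on to inhomogeneous layers.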

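The general solution of (3.8) is

$$\hat{\bar{\mathbf{\Pi}}} = \cosh\left(\hat\Gamma(\bar z - \bar z_0)\right)A + \sinh\left(\hat\Gamma(\bar z - \bar z_0)\right)B \qquad (3.9)$$

We obtain for the relation between the potentials and their derivatives in the planes A (z̄ = z̄_0) and
B (z̄ = z̄_0 + d̄):

$$\frac{d}{d\bar z}\begin{bmatrix}\hat{\bar{\mathbf{\Pi}}}_A \\ \hat{\bar{\mathbf{\Pi}}}_B\end{bmatrix} = \begin{bmatrix} -\bar\gamma & \bar\alpha \\ -\bar\alpha & \bar\gamma \end{bmatrix}\begin{bmatrix}\hat{\bar{\mathbf{\Pi}}}_A \\ \hat{\bar{\mathbf{\Pi}}}_B\end{bmatrix} \qquad \bar\gamma = \hat\Gamma\left(\tanh\hat\Gamma\bar d\right)^{-1} \qquad \bar\alpha = \hat\Gamma\left(\sinh\hat\Gamma\bar d\right)^{-1} \qquad (3.10)$$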
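
The two-plane relation can be checked numerically for a single eigenvalue. The sketch below assumes a real, positive Γ and a general solution of the form cosh(Γz)·a + sinh(Γz)·b with plane A at z = 0 and plane B at z = d; the names are hypothetical and the code only verifies the algebra, it is not part of the original analysis.

```python
import math


def plane_coefficients(gamma, d):
    """Return (gamma_bar, alpha_bar) = (Gamma/tanh(Gamma d), Gamma/sinh(Gamma d))."""
    return gamma / math.tanh(gamma * d), gamma / math.sinh(gamma * d)


def potential(z, a, b, gamma):
    """General solution cosh(Gamma z)*a + sinh(Gamma z)*b of the decoupled equation."""
    return math.cosh(gamma * z) * a + math.sinh(gamma * z) * b


def derivative(z, a, b, gamma):
    """Derivative of the general solution with respect to z."""
    return gamma * (math.sinh(gamma * z) * a + math.cosh(gamma * z) * b)
```

With these definitions the derivative in plane A equals −γ̄·Π_A + ᾱ·Π_B and the derivative in plane B equals −ᾱ·Π_A + γ̄·Π_B, for any choice of the constants a and b.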

With  the  definitions  for  the  fields  (analogous  to  eq.  (3.5)) 


F  =  Diag(r,r)  (3.11) 


-jHy 

E 

Ex 

H. 

.  . 
and


$$\hat\epsilon = \mathrm{Diag}(\epsilon_x, \epsilon_y)$$

'  \[^D.  -nlDy\  +  ‘^1= 

the solutions for the fields transverse to the z direction can be written as


E  =  -Rn 


H  =  -n 

az 


or  in  the  transform  domain 


f=-S/7  H^~n  R  =  T^RT 


The  relation  between  the  electric  and  magnetic  field  of  the  two  planes  can  be  described  by  the  equation 


-7  a  Ua 

-a  7  Ub 


-7  OL  \  \R 


R  Eb 


In  analogy  to  [1]  we  may  write 


Ha 

y\ 

^2 

[  Ea 

Hb 

1/2 

Vi  _ 

[ 

^2  22--1 

3/1  -  r  7i2 

3/2  = 

^2  22.-1 

r  ScR  . 

Now  we  can  completely  use  the  algorithm  developed  there  for  transfer  from  one  layer  to  the  other 
[1][6].  This  is  only  possible  without  modifications  if  the  layers  are  homogeneous  because  for  this  case 
the  transformation  matrices  are  equal  for  each  layer.  Otherwise  the  admittances  and  fields  have  to  be 
transformed  back  to  the  original  domain  and  the  algorithm  should  be  used  for  these  quantities  in  the 
original  domain. 

For  homogeneous  layers  we  obtain  the  simplified  equations 

^  _  1  [a*  -  y/TTei'  1  ^  f.-2  \  Cfl.  1  (3,19, 


£rIy-\ 


from  eqs.  (3.13)  and  (3.15),  where  -  £re,  6  =  For  y,  and  we  have 


3/1  =7^  1/2  = 

For an infinitely thick layer we obtain


£d^x  \/£re^ 

_ 2 

\/^re<5  £rly  “ 


y  j  ==  f  yl  3/2  =  0 
Nonequidistant Discretization

It should be mentioned that a nonequidistant discretization can be introduced, too. In this case the
potential and the difference operators are normalized with the diagonal matrices resulting from the
discretization distances. We define

Sr  =  diap;(  -  dia,n;(  /h  n)  s  ~  dia.'j,'  5,J 

according to [11] and normalize

n„^s~^n  H„=S~^H  E„--^s-^E 

Drr,  =  Sr.D,rS,r  D,,„  =  S.,jD,,S,  (d.'il) 

The whole analysis presented in section 3 formally remains the same for nonequidistant discretization if
these normalized quantities are used instead of the original ones.

4  Results 


To give an example for the validity of the new approach, a strip-loaded slab guide as given in
[illegible] has been analysed. The convergence curve is exactly the same as for the conventional MoL
approach [5]. The field distribution given in Fig. 6 clearly exhibits the vectorial nature of the
fundamental HE_00 mode (quasi TE-mode). The main magnetic component H_z must be negative with respect to
the main electric component E_x because the Poynting vector S_y must be positive. The total number of
discretization lines used in the computation is [illegible] for each line system for the half structure.
For a clear representation the field is depicted within the range [illegible] only in Fig. 6.

As a second example the microwave electric field distribution in the cross section of an electro-optic
modulator is presented in Fig. 5. This multilayered structure contains metallic electrodes (black in
Fig. 5) and semiconductor layers (shaded). Both are modelled by a suitable complex permittivity
ε_r = ε_r' − jε_r'', where ε_r' = [illegible] for semiconductors; ε_r'' takes the values [illegible]
for the top and bottom layers, respectively, and ε_r is purely imaginary [illegible] for the electrodes.
The optical field can be calculated with the same computer program. Again the computational window is
wider on the lateral sides, but it has been cut at the half electrode width for better representation.


Fig. 4: Integrated optical waveguide on InP according to [illegible], operating at λ = [illegible].

Fig. 5: Microwave electric field distribution in the cross section of an electro-optic modulator on InP
at f = 10 GHz.

Fig. 6: Field distribution of the HE_00 mode at the half height of the slab layer, normalized to the
maximum of E_x. The shaded area corresponds to the strip width w.


5  Summary

An alternative approach for the vectorial analysis of multilayered optical waveguides has been presented.
The formulation uses a different Hertz potential than the conventional approach and is clearer and easier
to program. Results for a rib waveguide are in excellent agreement with other vectorial approaches. The
microwave field distribution of an electro-optic modulator has been given, too.
Appendix

The complete system of coupled equations reads as follows


d  /  I  dn A  d^i 


dx 

dx^ 

dx^ 


7 - bTT  +  “5=^-  + 


dx 
dy'^


1  fdu, 

£r  V  dy 


dx 


dn,  dUy 
dz^  dy  \ 


=  0  (Al) 
Vector Finite Element Analysis of Lossless and Lossy Dielectric Waveguides
Abstract- Recently a transverse magnetic field formulation of the finite element method for
solving lightwave propagation in lossless optical waveguides was demonstrated using only two
components of the magnetic field. We extend this formulation to include loss in the waveguide,
and compare its results with those of other formulations for a lossless and a lossy channel
waveguide. Our results agree well with previously published data. The advantage of the new
formulation is that it does not require a perturbation technique to solve for the loss in the
waveguide; the eigenvalue matrix formulation directly solves for the complex propagation constant.

I.  Introduction 

In the finite element analysis of lossy dielectric waveguides, the usual approach is to use a
variational formulation to solve for the lossless propagation modes of the guide [1-4] or the
frequency [4], and then to obtain the complex propagation constant with a perturbation approach
involving two or all three components of the fields. Recently, Abid et al. demonstrated a unique
approach that solves for the propagation constant of lossless dielectric waveguides using two
components (Hx and Hy) of the transverse magnetic field, with no spurious modes. In this formulation
ε (the dielectric constant) is assumed to be piecewise constant, and the coupling between the Hx and
Hy components of the field is imposed through the interface continuity of the tangential components
Ez and Hz. In this paper we extend this formulation to solve for the complex propagation constant of
lossy dielectric waveguides. The advantage of this method is that it does not require a perturbation
method; the eigenvalue matrix formulation directly solves for the complex propagation constant. We
compare the results of this approach with published results for a lossless and a lossy channel
waveguide. The results obtained agree very well with those from other techniques.

This paper is divided into three parts. For completeness, the formulation is presented, with its
extension to the complex case, in the first part. In the second part we discuss the numerical
implementation of the formulation using the FEM. In the third part we present results for a
lossless and a lossy waveguide, with comparison to published results.

II.  Formulation 

We first consider a harmonic wave propagating in the z-direction of the dielectric
waveguide; the fields are given by:

$$\mathbf{E}(x,y,z) = \mathbf{E}(x,y)\exp(-\gamma z) \qquad (1)$$

$$\mathbf{H}(x,y,z) = \mathbf{H}(x,y)\exp(-\gamma z) \qquad (2)$$

III.  Numerical Implementation


To implement the finite element solution of the above formulation, the dielectric waveguide
is discretized into 2×N×M right-angle first-order triangular elements. The functional of equations
(7a) and (7b) is:

$$F(H_i) = \iint \left[-\nabla H_i\cdot\nabla H_i + k^2\epsilon\,H_i H_i + \gamma^2 H_i H_i\right] dx\,dy \qquad (10)$$

where i = x, y. By minimizing the functional with respect to the nodal values of Hx and Hy, a set
of linear equations is obtained, and the problem is reduced to an eigenvalue matrix equation of the
form:

$$[S][H] = -\gamma^2 [T][H] \qquad (11)$$

The dimension of the square matrices [S] and [T] is 2[(N+1)(M+1) − 2(N+M)], because Hx and Hy are
set to zero at the 2(N+M) external boundary nodes. The coefficients of the [S] matrix are complex,
and each column of [H] is an eigenvector representing the values of Hx and Hy at the nodes.

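The matrix equation (11) is to be solved in a subspace of vectors that satisfy the inner
interface boundary conditions, namely the continuity of Ez and Hz across the common side of any
two adjacent triangles having different dielectric constants. The continuity of Ez and Hz is first
transformed into another set of equations, each associated with one side of a triangle along the
interface:

$$\left[\frac{1}{\epsilon_a}\left(\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y}\right)\right]_a = \left[\frac{1}{\epsilon_b}\left(\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y}\right)\right]_b \qquad (12a)$$

$$\left[\frac{\partial H_x}{\partial x} + \frac{\partial H_y}{\partial y}\right]_a = \left[\frac{\partial H_x}{\partial x} + \frac{\partial H_y}{\partial y}\right]_b \qquad (12b)$$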


Both equations are normalized by a multiplication factor equal to the length of the common
interface side. The subscripts "a" and "b" denote the adjacent triangles with different dielectric
constants, and this results in a total of "r" equations of type (12). This set can be written in
matrix form:

$$[R][H] = \lambda [I][H] \qquad (13)$$

where [I] is an identity matrix. The eigenvectors of (13) associated with the zero eigenvalues are
the set of vectors that defines the subspace in which (10) is to be solved. The matrices [R] and [H]
have the same dimensions as [S]. The matrix [R] is singular since it has only "r" non-zero
equations, and we solve (13) by singular value decomposition. The solutions sought are for λ = 0,
and the associated eigenvectors constitute the null space of this equation. An alternative approach
is to use Gaussian elimination to obtain the null set of eigenvectors. The dimension of the null
space is (n−r), and the vectors of the new basis form a rectangular matrix [Z] of dimension
n × (n−r). Since the solutions are now sought in the null space of (13), the new vectors [C] are
related to the old vectors [H] by:

$$[H] = [Z][C] \qquad (14)$$

Substituting (14) into (11), then multiplying both sides by [Z^t], results in:

$$[S_n][C] = -\gamma^2 [T_n][C] \qquad (15)$$

where:

$$[S_n] = [Z^t][S][Z] \qquad (16)$$

and

$$[T_n] = [Z^t][T][Z] \qquad (17)$$

Equation (15) is solved for γ and [C]; then [C] is mapped to the original space through the
relation in (14).
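
The projection onto the null space of the constraint matrix can be sketched with small stand-in matrices. This is a hedged illustration only: the matrices produced by `demo` are random placeholders rather than finite element matrices, the function names are hypothetical, and NumPy's dense SVD and eigensolver are used in place of whatever solver the authors employed.

```python
import numpy as np


def null_space_basis(R, tol=1e-10):
    """Columns of Z span the null space of R, obtained from the SVD of R."""
    _, s, vh = np.linalg.svd(R)
    rank = int(np.sum(s > tol * s[0]))
    return vh[rank:].conj().T


def constrained_modes(S, T, R):
    """Solve S H = lam T H restricted to vectors H with R H = 0.

    Returns (lam, H, Z); lam plays the role of -gamma^2 in the text.
    """
    Z = null_space_basis(R)
    Sn = Z.T @ S @ Z                      # reduced matrix, cf. Z^t S Z
    Tn = Z.T @ T @ Z                      # reduced matrix, cf. Z^t T Z
    lam, C = np.linalg.eig(np.linalg.solve(Tn, Sn))
    H = Z @ C                             # map back to the original space
    return lam, H, Z


def demo(n=6, r=2, seed=0):
    """Build placeholder matrices (not FEM matrices) and solve the reduced problem."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    B = rng.standard_normal((n, n))
    T = A @ A.T + n * np.eye(n)                # symmetric positive definite
    S = (B + B.T) + 0.01j * np.eye(n)          # complex symmetric, mimicking loss
    R = np.zeros((n, n))
    R[:r] = rng.standard_normal((r, n))        # only r non-zero constraint rows
    lam, H, Z = constrained_modes(S, T, R)
    return lam, H, Z, S, T, R
```

Every column of the returned `H` satisfies the constraints by construction, because it is a combination of the columns of `Z`. The reduced problem has n − r unknowns instead of n.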

IV.  Results

We apply the above method to find the propagation modes of a lossy channel waveguide.
Fig. 1 shows the dimensions of the channel waveguide used for the simulation; the gain curve for
the TE(1,1) mode of the guide is given in Fig. 2. The results are compared with those from reference
[1] and show good agreement over the range of guide dimensions simulated. In our simulation, the
gain region is 0.2 µm thick with a width of 1.0 µm. This region has a refractive index of
n1 = 3.5 + j0.001. These dimensions correspond to the case of d = 5w in references [1, 4]. The
cladding layers have a refractive index of n2 = 3.2 − j0.001 and are 2.0 µm thick and 5 µm wide.
An overview of the magnitude of the magnetic field distribution overlaid with the contour plot is
given in Fig. 3 for the case of k0d = 3. The contours are set at levels of 0.0 to 1.0 with
intervals of 0.1. The contour plot shows that the mode is well confined within the gain region of
the guide and that a stable solution for the natural mode of the guide is obtained using the present
approach.


V.  Conclusion 

We have extended the new transverse H-field FEM formulation to obtain the complex
propagation constant of lossy dielectric waveguides, including loss or gain, with no spurious
modes. The results show good agreement with published data for a lossy channel waveguide.
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Since  the  solutions  are  now  sought  in  the  null  space  of  (13) ,  the  new  vectors  fC]  arc  related  to  the 
old  vectors  [H]  by: 

[HHZlfC]  (14) 

Substituting  (14)  into  (11),  then  multiplying  both  sides  by  [Z^]  resulLs  in: 

[SnjlCI  - -y2  (Tn][C]  (15) 

where: 

!Sn|  =  [Zt]fSJ[ZJ  (16) 

and 

[Tn]  =  iZtim[Zl  (17) 

Equation  (15)  is  solved  for  y  and  [C],  then  [C]  is  mapped  to  the  original  space  through  the 
relations  in  (14). 

IV. Results

We apply the above method to find the propagation modes of a lossy channel waveguide.
Fig. 1 shows the dimensions of the channel waveguide used for the simulation; the gain curve for
the TE(1,1) mode of the guide is given in Fig. 2. The results are compared to those from reference
[1] and show good agreement over the range of guide dimensions simulated. In our
simulation, the gain region is 0.2 μm thick and 1.0 μm wide. This region has a refractive
index of n1 = 3.5 + j0.001. These dimensions correspond to the case of d = 5w in references [1,
4]. The cladding layer has a refractive index of n2 = 3.2 - j0.001 and is 2.0 μm thick and 5 μm
wide. An overview of the magnitude of the magnetic field distribution, overlaid with the contour
plot, is given in Fig. 3 for the case of k0d = 3. The contours are set at levels from 0.0 to 1.0 at
intervals of 0.1. The contour plot shows that the mode is well confined within the gain region of
the guide and that a stable solution for the natural mode of the guide is obtained with the present
approach.

V.  Conclusion 

We have extended the new transverse H-field FEM formulation to obtain the complex
propagation constant of dielectric waveguides with loss or gain, with no spurious
modes. The results show good agreement with published data for a lossy channel waveguide.
Fig. 1: Channel waveguide with dimensions w = 1.0 μm, d = 2.0 μm for
the gain region and w1 = 5.0 μm and d1 = 1.0 μm for the cladding.
With the continuing and heightened interest in linear and nonlinear optically integrated
devices, more accurate and realistic numerical simulations of these devices and systems are
in demand. Such calculations provide an integrated optics/photonics testbed in which one
can investigate new basic and engineering concepts, materials, and device configurations
before they are fabricated. The time from device conceptualization to fabrication and testing
should therefore be greatly shortened by numerical simulations that incorporate
more realistic models of the linear and nonlinear material responses and the actual device
geometries. It is felt that vector and higher dimensional properties of Maxwell’s equations
that are not currently included in existing scalar models, in addition to more detailed
materials models, may significantly impact the scientific and engineering results.

We have been simulating a variety of linear and nonlinear corrugated waveguiding systems
for their applications to integrated optics systems. Corrugated waveguide structures
have many potential uses as beam steerers and grating-assisted couplers. We are developing
a simulation toolbox that eventually will be used to design these and many other integrated
optical devices. To meet self-imposed design goals that specify integrated optical devices
that are only a few wavelengths or pulse lengths in size, we require a thorough understanding
of the basic physics that we are modeling, without the approximations
generally used for this class of problems. This in turn has required our simulations to be
based upon numerically solving the full-wave, vector Maxwell’s equations. We have shown
that this approach leads to a superior understanding of the underlying physics and to improved
engineering designs. These numerical solutions have been obtained in two space
dimensions and time with a nonlinear finite difference time domain (NL-FDTD) method
which combines a generalization of a standard, FDTD, full-wave, vector, linear Maxwell’s
equations solver with a Lorentz linear dispersion model, a nonlinear Raman model, and
an instantaneous Kerr nonlinear model. In particular, we are solving in a self-consistent
manner the system of equations:
$$\partial_t [\mu_0 \vec{H}] = -\nabla \times \vec{E} \qquad (1)$$

$$\partial_t [\epsilon_0 \vec{E}] = \nabla \times \vec{H} - \partial_t \vec{P} \qquad (2)$$

$$\partial_t^2 \vec{P}^{L} + \Gamma\, \partial_t \vec{P}^{L} + \omega_0^2 \vec{P}^{L} = \epsilon_0\, \omega_0^2\, \chi_0\, \vec{E} \qquad \text{Lorentz Model} \qquad (3)$$

$$\partial_t^2 \chi^{NL} + \Gamma_R\, \partial_t \chi^{NL} + \omega_R^2\, \chi^{NL} = \epsilon_R\, \omega_R^2\, |\vec{E}|^2 \qquad \text{Raman Model}, \qquad (4)$$

where $\vec{P} = \vec{P}^{L} + \vec{P}^{NL}$ and

$$\vec{P}^{NL} = \epsilon_0\, \chi^{NL} \vec{E} + \epsilon_0\, \chi^{K} |\vec{E}|^2 \vec{E}, \qquad (5)$$

the last term representing the instantaneous Kerr nonlinearity, $\chi^{K}$ being the instantaneous
Kerr susceptibility. The resulting NL-FDTD simulator can model pulse propagation
in complex environments under the influence of linear and nonlinear dispersive, linear and
nonlinear diffractive, and time retardation effects of the materials in and surrounding the
electromagnetic structures. By coupling the linear and nonlinear dispersion models together
simultaneously with the natural boundary conditions arising from dielectric and
metallic discontinuities, we are able to handle the gratings and corrugated interfaces readily.
Moreover, both the TE and TM polarization cases can be simulated. Consequently,
more complex, realistic integrated optical structures are straightforwardly modeled with
the NL-FDTD approach.
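
As an illustration of how a Lorentz-type polarization equation can be advanced in time next to the field updates, the sketch below applies a central-difference update to a damped, driven oscillator of the assumed form P'' + Γ P' + ω0² P = ε0 ω0² χ0 E. The function names, the normalized units (ε0 = 1), the damping symbol Γ, and all parameter values are assumptions made for illustration only; this is not the authors' NL-FDTD code.

```python
def lorentz_step(p_prev, p_curr, e_field, dt, omega0, gamma, chi0, eps0=1.0):
    """Advance the polarization one time step with central differences.

    Discretizes  P'' + gamma P' + omega0^2 P = eps0 omega0^2 chi0 E  as
    (P+ - 2P + P-)/dt^2 + gamma (P+ - P-)/(2 dt) + omega0^2 P = rhs.
    """
    a = 1.0 / dt**2 + gamma / (2.0 * dt)   # coefficient of P at step n+1
    b = 2.0 / dt**2 - omega0**2            # coefficient of P at step n
    c = 1.0 / dt**2 - gamma / (2.0 * dt)   # coefficient of P at step n-1
    rhs = eps0 * omega0**2 * chi0 * e_field
    return (b * p_curr - c * p_prev + rhs) / a


def relax_to_steady_state(e_field, steps, dt=0.05, omega0=1.0, gamma=0.5, chi0=2.0):
    """Drive the oscillator with a constant field, starting from rest."""
    p_prev, p_curr = 0.0, 0.0
    for _ in range(steps):
        p_prev, p_curr = p_curr, lorentz_step(p_prev, p_curr, e_field,
                                              dt, omega0, gamma, chi0)
    return p_curr


# For a constant field the polarization should settle at eps0 * chi0 * E.
p_final = relax_to_steady_state(e_field=0.3, steps=4000)
```

With a time-varying field the same update would be called once per time step, using the field value at that step.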

The  NL-FDTD  approach  can  handle  ultrafast  single-cycle  cases  as  readily  as  multiple- 
cycle  cases  having  an  intrinsic  carrier  wave.  Since  most  current  optical  systems  deal  directly 
with  a  carrier-wave  type  signal,  the  NL-FDTD  approach  can  simulate  the  propagation  and 
scattering  effects  associated  with  those  narrow  bandwidth  systems.  However,  it  can  also 
simulate the behaviors of the interactions of ultrafast pulses. Ultrafast pulses are single-cycle
or multiple-cycle envelopes containing fewer than 15 cycles. Sources in the laboratory
have produced pulses compressed to as short as 4 fs and the optics community is already
investigating  the  attosecond  regime.  By  using  these  ultrafast  sources  we  illustrate  two 
advantages  of  the  time  domain  approach:  (1)  the  ability  to  carry  phase  information  over 
a  wide  spectrum,  and  (2)  the  ability  to  model  transient  effects  which  occur  either  quickly 
or  slowly  relative  to  the  time  scale  of  the  pulse.  The  evolution  of  the  pulse  in  the  medium 
can  be  dependent  on  both  the  material’s  resonances  in  the  presence  of  the  beam  as  well 
as the initial shape of the exciting pulse. Switching or steering of this type of pulsed
beam requires one to take advantage of interference effects and the material’s transient
response.
The complex waveguiding structures under consideration are filled with either linear or
nonlinear dispersive materials that have finite response times. The corrugations themselves
can be modeled as dielectric teeth (an extension of the dielectric waveguide) or metallic
teeth (deposited into or on top of the dielectric waveguide). A corrugated waveguide with
these dielectric or metallic teeth can be viewed as a leaky-wave antenna. The corrugation
section is a slow-wave structure whose impedance properties determine the properties
of its radiated fields. The field radiated by an infinite linear or nonlinear corrugated
structure can be modeled with a Floquet mode representation. The resulting fields have to
satisfy a phase matching or Bragg condition resulting from the electromagnetic boundary
conditions. Physically this means that because of the regular placement of the teeth in the
corrugation section, the individual scattered fields will interfere constructively only along
certain preferred directions and the “leaked” energy will appear in the form of pulsed
beams that radiate at angles specified by the Bragg condition both into the air and into
the substrate regions.

In particular, let $\theta_t$ be the angle that the radiated beam makes with
the normal of the waveguide, $n_0$ be the index of refraction above the corrugations, and
$n_g = n_B + n_2 I$ be the index of refraction in the waveguide, which includes the effective
waveguide index $n_B$ (which varies slightly between the TE and TM cases to achieve the
desired $TE_0$ and $TM_0$ initial spatial amplitude distributions) and the intensity-induced
index change $n_2 I$. This Bragg condition then takes the form

$$\frac{\omega}{c}\, n_0 \sin\theta_t = \frac{\omega}{c}\, n_g + m\, \frac{2\pi}{\Lambda} \qquad \text{where } m = 0, \pm 1, \pm 2, \ldots$$

or

$$\theta_t = \sin^{-1}\left[ \frac{n_B}{n_0} + \frac{n_2}{n_0}\, I + m\, \frac{\lambda}{n_0 \Lambda} \right] \qquad \text{where } m = 0, \pm 1, \pm 2, \ldots \qquad (6)$$

This  immediately  translates  into  a  practical  device;  the  output-beam  from  the  corrugation 
section  can  be  steered  away  from  the  normal  by  the  strength  of  the  intensity  of  the  input 
waveguide  pulsed-beam,  the  size  of  the  unit  cell  or  the  strength  of  the  nonlinearity. 

A special case of this relationship suggests a useful output coupler design. If we specify
that the corrugation spacing be $\Lambda = \lambda / n_B$, then the first-order ($m = -1$) output beam
from the corrugation section of the waveguide has the transmission angle:

$$\theta_t = \sin^{-1}\left[ \frac{n_2 I}{n_0} \right] \qquad (7)$$


Thus,  the  output-beam  from  the  corrugation  section  can  be  steered  away  from  the  normal 
simply  by  adjusting  the  strength  of  the  intensity  of  the  input  waveguide  pulsed-beam  or 
the  nonlinear  index.  Our  simulations  have  confirmed  this  effect. 
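
The steering relation can be checked numerically. The following minimal sketch evaluates the Bragg condition (6) for a given diffraction order; the function name, argument names, and numerical values are hypothetical and chosen only to illustrate the special case in which the corrugation spacing equals the wavelength divided by the effective waveguide index.

```python
import math


def bragg_angle(n_b, n2_i, n0, wavelength, period, m=-1):
    """Radiation angle (radians, measured from the waveguide normal) from (6).

    n_b        effective waveguide index
    n2_i       intensity-induced index change n2 * I
    n0         index of refraction above the corrugations
    wavelength free-space wavelength (same length unit as period)
    period     corrugation spacing
    m          diffraction order
    Returns None when the order does not radiate (|sin(theta)| > 1).
    """
    s = (n_b + n2_i) / n0 + m * wavelength / (n0 * period)
    if abs(s) > 1.0:
        return None
    return math.asin(s)


# Special case: period = wavelength / n_b, first order m = -1.
wavelength = 1.5
n_b, n0 = 2.0, 1.0
period = wavelength / n_b
theta_low = bragg_angle(n_b, 0.0, n0, wavelength, period)    # no induced index
theta_high = bragg_angle(n_b, 0.01, n0, wavelength, period)  # induced index 0.01
```

In this special case the linear contributions cancel, so the angle depends only on the ratio of the induced index change to the index above the corrugations.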
The multi-dimensional NL-FDTD model has been applied to the modeling of the extraction
of energy from a variety of linear and nonlinear waveguiding structures using corrugated
waveguide sections. Expected conversion efficiencies from the guided mode energy
to the radiated field energy have been observed in the linear case. The nonlinear waveguiding
structures are presenting interesting challenges in their analysis and interpretation. A
variety of TE and TM cases with metallic corrugations will be presented to illustrate the
desired linear and nonlinear output coupler and beam steering effects. Typical simulation
geometries are shown in the figures below. Since the electric field behavior near the edges
of these metallic corrugations is significantly different between the two polarizations, the
resulting radiated field structures reflect this difference. Output beam characteristics depending
on the medium response time, the polarization, and the material parameters have
been studied and will be reported. Near field simulations obtained with the NL-FDTD
approach are translated into far field information with near-to-far-field transforms tailored
to the Fresnel and Fraunhofer regimes. Particular emphasis will be given to ultrafast pulses
whose time-record length is approximately the same size as the corrugation region. It is
found that even pulses that are short in comparison with the corrugation region can be
effectively used to beam steer and couple energy through a grating-assisted coupler from
one corrugated waveguide to another.
1. The output direction of nonlinear, grating-assisted, beam-steering integrated optic
devices can be controlled by the intensity of the incident pulse.


2.  The  S-parameters  of  nonlinear  grating-assisted,  waveguide  output  couplers 
can  be  controlled  by  the  intensity  of  the  incident  pulse. 


Analysis  of  Coupled  Nonlinear  Optical  Waveguides  by  Matrix  Method 


Vijai  Tripathi  and  Andreas  Weisshaar 
Department  of  Electrical  and  Computer  Engineering 
Oregon  State  University 
Corvallis,  Oregon  97331 

H.S.  Chang 
Hanyang  University 
Seoul,  Korea 


An  improved  matrix  method  for  the  analysis  and  design  of  nonlinear  directional  couplers  (NLDC) 
with  saturable  coupling  media  is  presented.  The  method  represents  an  extension  of  the  original 
matrix  method  which  has  been  used  to  study  linear  waveguides  with  homogeneous  or 
inhomogeneous  refractive  index  profiles  and  three-layer  nonlinear  waveguides  with  Kerr-like  or 
non-Kerr-like  medium.  The  original  matrix  approach  becomes  inaccurate  for  a  structure  where  a 
nonlinear  medium  is  bounded  by  two  linear  films  of  finite  thickness.  The  extended  method  is  based 
on  an  iterative  averaging  algorithm  which  calculates  the  average  values  of  the  dielectric  constant 
and  the  field  amplitude  in  each  stratified  nonlinear  layer.  The  values  obtained  in  the  averaging 
process  converge  very  fast  and  the  method  requires  only  two  or  three  iterations  for  each  layer.  This 
numerical  method  is  applied  to  compute  the  optical  intensity  dependent  output  power  distributions 
of  symmetric  and  asymmetric  multiple  quantum  well  (MQW)  nonlinear  directional  couplers.  A  two- 
level  saturation  model  is  incorporated  with  the  matrix  method  to  consider  saturation  of  the  refractive 
index  in  the  MQW  coupling  medium.  The  optical  intensity  dependent  dispersion  characteristics  for 
individual  guided  modes  of  a  symmetric  structure  are  compared  with  the  exact  solutions  expressed 
in  terms  of  Jacobian  elliptic  functions.  To  analyze  output  power  distributions  of  the  couplers,  the 
matrix  method  utilizes  the  mode  combination  method  which  expresses  the  total  field  as  a 
combination  of  symmetric  and  antisymmetric  modes  that  are  perturbed  by  the  change  of  the 
refractive  index  profile  due  to  the  presence  of  the  other  mode.  For  a  symmetrical  structure,  the 
numerical  results  are  shown  to  be  in  agreement  with  published  experimental  data.  Typical 
simulation  results  and  applications  of  coupled  structures  are  presented. 
Abstract

Mandelbrot  [1]  observed  that  many  natural  objects  possess  an  inherent  self-similarity  in  their 
geometrical  structure.  In  order  to  quantify  this  behavior,  Mandelbrot  coined  the  term  fractal  and 
introduced  the  concept  of  fractal  geometry.  Since  the  pioneering  work  of  Mandelbrot  and  others, 
fractals  have  been  finding  increasing  applications  in  the  fields  of  engineering  and  science.  Of  particular 
interest  in  this  paper  is  the  research  area  known  as  fractal  electrodynamics.  The  term  fractal 
electrodynamics was first suggested by Jaggard in 1990 to identify the newly emerging branch of
research  which  combines  fractal  geometry  with  Maxwell's  theory  of  electromagnetism  [2].  This  paper 
is  intended  to  present  a  brief  introduction  to  the  subject  of  fractal  electrodynamics  followed  by  an 
overview  of  significant  research  in  the  field. 

1.  Introduction 

Modeling  of  man-made  objects  has  benefited  from  the  simplicity  of  the  objects  and  a  reliance  upon 
classical Euclidean geometry and the well-developed mathematics of polynomials. Modeling of natural
or complex objects has proved to be much more difficult. The traditional approach to modeling
natural structures has been to approximate them with a collection of elementary Euclidean objects such
as circles, squares, triangles, cubes, spheres, disks, cylinders, cones and ellipsoids. However, naturally
occurring objects typically possess structure on several length scales, which is very difficult to describe
accurately in terms of Euclidean geometric approximations. Fractal geometry is an extension or
generalization of classical Euclidean geometry, and it is well suited for use in constructing precise
models of physical structures. Among the items which have been successfully modeled using fractals
are profiles of forest tops (vegetation canopies), trees, leaves, ferns, edges of clouds, snowflakes,
coastlines and sea-floor topography.

One of the important attributes of many fractals is that they possess a self-similar structure on all scales
[1]. This designation means that any small portion of a fractal under high magnification looks like some
larger portion under low magnification. In other words, these fractals have the property that they are
scale invariant. For any fractal resulting from a physical system, scale invariance exists only over a
finite range of scales. Under these conditions, the fractal behavior is bounded from both above and
below. Physical fractals which have characteristic inner and outer scale lengths are known as
bandlimited fractals [2]. The occurrence of bandlimitation can be attributed to the fact that natural
fractals are often the result of regular but nonperiodic forces which give rise to complex structures
through repetitive actions.


Fractals  can  be  quantified  and  compared  by  using  certain  numbers  which  are  associated  with  their 
behavior.  These  numbers  are  commonly  called  fractal  dimensions.  Fractal  dimensions  provide  a 
measure  of  the  degree  to  which  a  fractal  fills  the  metric  space  it  is  contained  in.  There  are  several 
definitions of fractal dimension in use. The two most frequently used are the Hausdorff-Besicovitch
and the box-counting fractal dimensions [3,4]. Of these, the box-counting definition is the one usually used
for the computational or experimental determination of fractal dimensions of physical sets.


The  connection  between  the  box-counting  fractal  dimension  and  our  intuitive  concept  of  dimension  can 
be  established  by  considering  the  following  example  [2,5].  Let  F  be  the  one-dimensional  line  segment 
of length $L$. This line segment can be divided into $N_\epsilon$ identical smaller segments of length $\epsilon$ which are
self-similar to $L$ with a scale factor of $L/N_\epsilon$. Hence, the number of segments of length $\epsilon$ that are
contained in the line segment of length $L$ is

$$N_\epsilon(L^1) = L \left(\frac{1}{\epsilon}\right)^{1} \qquad (1)$$

The line segment of length $\epsilon$ can be considered as a one-dimensional yardstick for measuring the line
segment of length $L$. Similarly, let $F$ be the two-dimensional area $A = L^2$. This area can be divided into
$N_\epsilon$ identical smaller areas of length $\epsilon$ on a side which are self-similar to $A$ with a scale factor of
$L/(N_\epsilon)^{1/2}$. The number of squares of area $\epsilon^2$ that are contained in the square of area $L^2$ is

$$N_\epsilon(L^2) = L^2 \left(\frac{1}{\epsilon}\right)^{2} \qquad (2)$$

The yardstick in this instance is the square with sides of length $\epsilon$. Next, suppose that $F$ is the
three-dimensional volume $V = L^3$. This volume can be divided into $N_\epsilon$ identical smaller volumes of length
$\epsilon$ on a side which are self-similar to $V$ with a scale factor of $L/(N_\epsilon)^{1/3}$. The number of cubes of volume
$\epsilon^3$ that will fit inside the cube of volume $V$ is

$$N_\epsilon(L^3) = L^3 \left(\frac{1}{\epsilon}\right)^{3} \qquad (3)$$

The cube with sides of length $\epsilon$ represents the yardstick. Finally, it is recognized that the above three
results represent special cases of the measurement of an $n$-dimensional cube with volume $L^n$ using an
$n$-dimensional cube with sides of length $\epsilon$ as the yardstick. The number of cubes of volume $\epsilon^n$ required
to fill a cube of volume $L^n$ is then

$$N_\epsilon(L^n) = L^n \left(\frac{1}{\epsilon}\right)^{n} \qquad (4)$$


The  above  treatment  of  self-similar  objects  which  have  Euclidean  or  integer  dimension  n  can  be 
generalized  to  include  self-similar  objects  which  have  fractional  dimension  D.  The  relationship  between 
yardstick size $\epsilon$ and number of yardsticks $N_\epsilon$ for a $D$-dimensional self-similar object is assumed to be

$$N_\epsilon(F) = C \left(\frac{1}{\epsilon}\right)^{D} \qquad (5)$$

for some positive constant $C$. Taking the natural logarithm of both sides and solving for $D$ results in

$$D = \frac{\ln N_\epsilon(F) - \ln C}{\ln(1/\epsilon)} \qquad (6)$$

The box-counting fractal dimension of $F$ is defined as the value of $D$ to which the right-hand side of (6)
converges when $\epsilon$ tends to zero. Hence

$$\dim_B(F) = \lim_{\epsilon \to 0} \frac{\ln N_\epsilon(F)}{\ln(1/\epsilon)} \qquad (7)$$

where use has been made of the fact that

$$\lim_{\epsilon \to 0} \frac{\ln C}{\ln(1/\epsilon)} = 0 \qquad (8)$$

A  brief  introduction  to  the  underlying  geometric  properties  of  fractals  has  been  presented  in  this  section. 
For  the  interested  reader,  a  more  in-depth  treatment  of  the  subject  may  be  found  in  the  excellent  review 
by  Jaggard  [2].  The  following  section  contains  a  summary  of  significant  research  in  the  relatively  new 
field  of  fractal  electrodynamics. 

2.  Literature  Review 

The  intent  of  this  section  is  to  present  a  brief  summary  of  research,  complete  with  references,  in  the 
discipline of fractal electrodynamics. Every effort has been made to provide a review which is
comprehensive in scope as well as up to date. The information contained in this section may be used
to  provide  a  starting  point  for  the  serious  researcher  or  as  background  material  for  the  reader  with 
merely  a  casual  interest  in  the  subject. 
A special section devoted to fractals in electrical engineering was featured in a recent issue of the
Proceedings  of  the  IEEE  [6].  The  first  paper  in  this  special  section  makes  use  of  the  multiple  scale 
nature  of  wavelets  to  represent  the  1/f  family  of  self-similar  fractal  signals.  This  paper  provides  an 
illustration  of  the  connection  that  exists  between  wavelets  and  fractals.  Another  paper  in  this  special 
section  discusses  how  fractional  derivative  operators  in  electromagnetic  theory  and  superconductivity 
may  be  linked  to  fractals.  This  theory  is  applied  to  several  fractal  electrodynamics  problems  including 
electrochemical,  dielectric  and  magnetic  relaxations.  Other  papers  in  this  special  section  of  interest 
include  the  use  of  "Fractal  Brownian  Motion  Models  for  Synthetic  Aperture  Radar  Imagery  Scene 
Segmentation"  and  "Ultrasonic  Characterization  of  Fractal  Media." 

The  scattering  of  electromagnetic  waves  from  corrugated  random  surfaces  with  fractal  slopes  was 
considered  by  Jakeman  [7,8].  A  generalized  Rayleigh  solution  [9]  as  well  as  a  Kirchhoff  solution  [10] 
have  been  obtained  for  scattering  from  fractally  rough  surfaces.  The  AC  response  of  fractally  rough 
interfaces has been investigated by Liu and Kaplan [11]. Also, the important question of whether there
is a radar clutter attractor has been addressed by Leung and Haykin [12]. Other areas of research
include  the  study  of  diffraction  by  bandlimited  fractal  screens  [13,14],  optical  beam  propagation  in  a 
bandlimited  fractal  medium  [15],  wave  transmission  through  a  one-dimensional  Cantor-like  fractal 
medium  [16],  reflection  from  fractal  multilayer  media  [17,18],  scattering  from  bandlimited  fractal  fibers 
[19],  and  fractal  models  of  atmospheric  refractivity  fluctuation  [20]. 

In  addition  to  the  fractal  electrodynamics  research  noted  above,  there  has  also  been  some  work  done 
in  the  area  of  fractal  antennas,  arrays  and  apertures.  The  application  of  fractals  to  the  discipline  of 
antenna array theory was first reported by Kim and Jaggard [21]. They made use of the underlying
order  in  fractal  geometry  to  develop  a  procedure  for  the  design  of  low  sidelobe  random  arrays.  This 
procedure  combines  the  virtues  of  periodic  subarray  generators  with  those  of  random  array  initiators 
to  form  a  quasi-random  linear  array  composed  of  self-similar  subarrays.  Allain  and  Cloitre  [22]  discuss 
properties  associated  with  the  spatial  spectrum  of  a  general  family  of  self-similar  deterministic  arrays 
which  are  constructed  recursively  by  a  certain  inflation  method.  The  problems  of  diffraction  by 
fractally  serrated  apertures  and  triadic  Cantor  targets  have  also  been  investigated  [23,24].  The  radiation 
of  electromagnetic  waves  by  fractal  structures,  known  as  fractal  radiators,  is  explored  in  [25].  In 
particular,  the  theory  of  frequency  independent  antennas  is  considered  from  the  fractal  geometric  point 
of  view.  Several  examples  of  self-similar  antennas  are  presented  including  logarithmic  spirals,  conical 
logarithmic  spirals,  and  log-periodics.  The  properties  of  fractal  arrays  are  also  briefly  discussed.  The 
fundamental  relationship  between  self-similar  fractal  arrays  and  their  ability  to  generate  radiation 
patterns  which  possess  fractal  features  is  examined  in  [26].  The  theoretical  foundation  and  design 
procedures  are  developed  in  this  paper  for  using  fractal  arrays  to  synthesize  fractal  radiation  patterns 
having  certain  desired  characteristics  (see  second  paper  of  this  section).  Finally,  a  fractal  approach  to 
lightning  radiation  on  a  tortuous  channel  is  studied  in  [27].  This  paper  demonstrates  that  the  lightning 
return  stroke  radiation  is  fractal  and  has  the  same  fractal  dimension  as  the  channel  path. 

A  review  of  fractal  electrodynamics  research  conducted  by  researchers  at  Xidian  University,  Xian, 
Shaanxi,  China,  was  included  in  the  Proceedings  of  the  1993  International  Symposium  on  Radio 
Propagation  (ISRP  '93)  [28].  Among  the  topics  addressed  are  wave  propagation  and  scattering  in 
fractal  media,  electromagnetic  scattering  from  a  one-dimensional  fractal  surface,  multiple  backscattering 
of millimeter waves from a random fractal atmosphere, and the electromagnetic scattering from a fractal
multilayered  cylinder  at  normal  incidence. 
1. INTRODUCTION


This  paper  investigates  using  fractal  arrays  to  synthesize  fractal  radiation  patterns  with  certain  desired 
features.  The  theoretical  foundation  and  design  procedures  for  fractal  radiation  pattern  synthesis  are 
developed. Generalized Weierstrass functions, which possess fractal characteristics, play a fundamental role
in the theory of fractal radiation pattern synthesis. With the appropriate choice of array element spacings
and excitations, bandlimited Weierstrass functions may be used to express the array factor for a
nonuniformly but symmetrically spaced linear array. The structure of the resulting fractal radiation patterns can
be controlled over a finite range of scales by the number of elements in the array. The fractal dimension of
the radiation pattern for a fixed array geometry may be varied by changing the array current distribution. A
unique synthesis technique is developed which is based on Fourier-Weierstrass expansions. This technique
allows the selection of an appropriate generating function, in addition to the dimension, for a desired fractal
radiation pattern. The fractal arrays which result from this procedure are composed of a sequence of
self-similar uniformly spaced linear subarrays.

2.  WEIERSTRASS  FRACTAL  ARRAYS 


Fractals  can  be  quantified  and  compared  by  using  certain  numbers  which  are  related  to  their  behavior. 
These  numbers  are  commonly  called  fractal  dimensions.  Fractal  dimensions  provide  a  measure  of  the 
degree  to  which  a  fractal  fills  the  metric  space  it  is  contained  in.  The  box-counting  or  box  definition  is 
usually  used  for  the  computational  or  empirical  determination  of  fractal  dimensions.  For  a  given  fractal  F, 
the box-counting fractal dimension, denoted by $\dim_B(F)$, is defined as [1]

$$\dim_B(F) = \lim_{\delta \to 0} \frac{\ln N_\delta(F)}{\ln(1/\delta)} \qquad (1)$$

where $N_\delta$ represents the smallest number of sets of diameter at most $\delta$ required to cover the fractal $F$. The
class of functions known as generalized Weierstrass functions is represented by

$$f(x) = \sum_{n=1}^{\infty} \eta^{(D-2)n}\, g(\eta^{n} x) \qquad (2)$$

where $1 < D < 2$, $\eta > 1$, and $g$ is a suitable bounded periodic function [1,2]. These generalized Weierstrass
functions  have  the  property  that  they  are  everywhere  continuous  but  nowhere  differentiable  and  exhibit 
fractal  behavior  at  all  scales.  The  fractal  dimension  D,  in  this  case,  is  a  fractional  dimension  which  lies 
between  the  integer  dimensions  of  one  and  two.  Generalized  Weierstrass  functions  are  the  foundation  on 
which  the  theory  of  fractal  radiation  pattern  synthesis  is  based. 

The  array  factor  for  a  nonuniformly  but  symmetrically  spaced  linear  array  of  2N  elements  may  be 
expressed  in  terms  of  a  bandlimited  Weierstrass  function  as 

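$$f(u) = 2 \sum_{n=1}^{N} I_n \cos(k d_n u + \alpha_n) \qquad (3)$$

provided the current amplitudes and element spacings are chosen according to

$$I_n = \eta^{(D-2)n} \qquad (4a)$$

$$k d_n = a\, \eta^{n} \qquad (4b)$$

with $\eta > 1$, $1 < D < 2$, $u = \cos\theta$, and $k = 2\pi/\lambda$, where $\lambda$ is the free-space wavelength. The Weierstrass partial sum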
of  (3)  may  be  classified  as  bandlimited  since  the  resulting  radiation  pattern  only  exhibits  fractal  behavior 
over  a  finite  range  of  scales.  The  structure  of  the  radiation  pattern  becomes  finer  and  more  detailed  as  the 
number  of  array  elements  is  increased. 


A  normalized  form  of  the  Weierstrass  array  factor  can  be  obtained  by  dividing  (3)  by  its  maximum  value. 
The  expression  for  this  normalized  array  factor  is 


$$g_N(u) = \frac{1 - \eta^{D-2}}{1 - \eta^{(D-2)N}} \sum_{n=1}^{N} \hat{I}_n \cos(a\, \eta^{n} u + \alpha_n) \qquad (5)$$

where

$$\hat{I}_n = \eta^{(D-2)(n-1)} \qquad (6)$$

represent the normalized excitation current amplitudes. Let $\tau$ be a constraint which is imposed on the
minimum separation between any two consecutive elements in the array. There are two possible cases in
which the minimum spacing constraint may be satisfied. These cases are as follows:

1) $d_2 - d_1 = \tau$ and $d_1 \ge \tau/2$ \qquad (7a)

2) $d_1 = \tau/2$ and $d_2 - d_1 \ge \tau$ \qquad (7b)

which may be used to derive an expression for $a$ as a function of $\tau$ and $\eta$:

$$a = \begin{cases} \dfrac{k\tau}{\eta(\eta - 1)}, & 1 < \eta \le 3 \\[2ex] \dfrac{k\tau}{2\eta}, & \eta \ge 3 \end{cases} \qquad (8)$$
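
The design relations of this section can be collected into a short numerical sketch. It assumes the element positions follow k d_n = a η^n, the normalized amplitudes are η^((D-2)(n-1)), and the scale parameter a is fixed by the minimum-spacing constraint as in (8). The function names, default wavelength, and parameter values are hypothetical and serve only as an illustration.

```python
import math


def weierstrass_array(n_elems, dim, eta, tau, wavelength=1.0):
    """Positions d_n (n = 1..N, one side of the symmetric array) and
    normalized amplitudes for minimum element separation tau."""
    k = 2.0 * math.pi / wavelength
    if eta <= 3.0:
        a = k * tau / (eta * (eta - 1.0))   # spacing set by d_2 - d_1 = tau
    else:
        a = k * tau / (2.0 * eta)           # spacing set by 2 d_1 = tau
    positions = [a * eta**n / k for n in range(1, n_elems + 1)]
    amplitudes = [eta ** ((dim - 2.0) * (n - 1)) for n in range(1, n_elems + 1)]
    return a, positions, amplitudes


def normalized_array_factor(u, n_elems, dim, eta, a, phases=None):
    """Normalized Weierstrass array factor; phases default to zero."""
    if phases is None:
        phases = [0.0] * n_elems
    ratio = eta ** (dim - 2.0)
    norm = (1.0 - ratio) / (1.0 - ratio**n_elems)
    return norm * sum(
        ratio ** (n - 1) * math.cos(a * eta**n * u + phases[n - 1])
        for n in range(1, n_elems + 1)
    )


a, d, amps = weierstrass_array(n_elems=5, dim=1.5, eta=2.0, tau=0.5)
broadside = normalized_array_factor(0.0, 5, 1.5, 2.0, a)
```

With zero phases every cosine equals one at u = 0, so the normalized factor is unity there, which is a quick check on the normalization constant.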


3.  FRACTAL  LINE  SOURCES 

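For a line source of infinite length, the radiation pattern F(u) and the current distribution I(s) are related by
the following Fourier transform pair [3]:

$$F(u) = \int_{-\infty}^{\infty} I(s)\, e^{j 2\pi s u}\, ds \tag{9a}$$

$$I(s) = \int_{-\infty}^{\infty} F(u)\, e^{-j 2\pi s u}\, du \tag{9b}$$

where

$$u = \cos\theta \tag{10a}$$

$$s = z/\lambda \tag{10b}$$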


In particular, suppose that the radiation pattern of an infinite line source may be represented as a
bandlimited generalized Weierstrass function of the form

$$F(u) = \sum_{n=0}^{N-1} \eta^{(D-2)n}\, g(\eta^{n} u) \tag{11}$$

where D is the fractal dimension and g(u) is a generating function. Here we assume that the generating
function g(u) is periodic and even, i.e. g(u + 2) = g(u) and g(-u) = g(u). Hence, g(u) may be expanded in a
Fourier cosine series as

$$g(u) = \frac{a_0}{2} + \sum_{m=1}^{\infty} a_m \cos(m\pi u) \tag{12}$$

where the Fourier coefficients are determined from

$$a_m = 2 \int_{0}^{1} g(u) \cos(m\pi u)\, du \tag{13}$$

Substituting (12) into (11) and replacing u by u+1 maps the interval [-1,1] to the interval [0,2]. This
results in

$$F(u) = \frac{f_0}{2} \left[\frac{\eta^{(D-2)N} - 1}{\eta^{(D-2)} - 1}\right] + \sum_{m=1}^{\infty} f_m \sum_{n=0}^{N-1} \eta^{(D-2)n} \cos[m\pi\eta^{n}(u + 1)] \tag{14}$$

where

$$f_m = 2 \int_{0}^{1} g(u - 1) \cos(m\pi u)\, du \tag{15}$$

with the requirements that $\eta > 1$ and $1 < D < 2$. We call such a representation a Fourier-Weierstrass
expansion.
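The cosine-series step can be checked numerically. The sketch below assumes a specific triangular generating function, g(u) = 1 - |u| on [-1, 1] extended with period 2 (the paper names a triangular generating function but its exact form is not recoverable here), computes the coefficients a_m = 2 * integral from 0 to 1 of g(u) cos(m pi u) du by the midpoint rule, and evaluates the bandlimited generalized Weierstrass sum directly. All function names are hypothetical.

```python
import math

def triangle(u):
    """Even, period-2 triangular generating function (assumed form): 1 - |u| on [-1, 1]."""
    v = (u + 1.0) % 2.0 - 1.0            # wrap the argument into [-1, 1)
    return 1.0 - abs(v)

def cosine_coefficient(g, m, samples=2000):
    """a_m = 2 * integral_0^1 g(u) cos(m pi u) du, approximated by the midpoint rule."""
    h = 1.0 / samples
    return 2.0 * h * sum(g((i + 0.5) * h) * math.cos(m * math.pi * (i + 0.5) * h)
                         for i in range(samples))

def cosine_series(coeffs, u):
    """a_0 / 2 + sum over m >= 1 of a_m cos(m pi u), truncated to the supplied coefficients."""
    return coeffs[0] / 2.0 + sum(c * math.cos(m * math.pi * u)
                                 for m, c in enumerate(coeffs) if m >= 1)

def weierstrass(g, u, n_terms, eta, dim):
    """Bandlimited generalized Weierstrass function: sum_{n=0}^{N-1} eta**((dim-2) n) g(eta**n u)."""
    return sum(eta ** ((dim - 2.0) * n) * g(eta ** n * u) for n in range(n_terms))

coeffs = [cosine_coefficient(triangle, m) for m in range(40)]
```

For this assumed triangle the odd coefficients are 4 / (m pi)**2 and the even ones (m >= 2) vanish, so a modest number of harmonics already reproduces g(u) closely.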

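The line source current distribution required in order to produce the desired fractal radiation patterns may
be obtained by evaluating the Fourier integral (9b). Performing the necessary
integration results in an expression for the current distribution given by

$$I(s) = f_0 \left[\frac{\eta^{(D-2)N} - 1}{\eta^{(D-2)} - 1}\right] \mathrm{sinc}(2\pi s) + \sum_{m=1}^{\infty} \sum_{n=0}^{N-1} f_m\, \eta^{(D-2)n} \left\{ e^{j m\pi\eta^{n}}\, \mathrm{sinc}[2\pi s - m\pi\eta^{n}] + e^{-j m\pi\eta^{n}}\, \mathrm{sinc}[2\pi s + m\pi\eta^{n}] \right\} \tag{16}$$

An approximation may be obtained for the current distribution on a finite length line source by truncating
(16) in the following way:

$$I_L(s) \approx \begin{cases} I(s), & |s| \le \dfrac{L}{2\lambda} \\[2ex] 0, & |s| > \dfrac{L}{2\lambda} \end{cases} \tag{17}$$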


Following this procedure yields

[...] = ± [...] { γ + ln[4π(L/λ)] - Ci[4π(L/λ)] }   for mη^n = L/λ


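$$\mathrm{Si}(x) = \int_{0}^{x} \frac{\sin t}{t}\, dt$$

$$\mathrm{Ci}(x) = -\int_{x}^{\infty} \frac{\cos t}{t}\, dt = \gamma + \ln(x) + \int_{0}^{x} \frac{\cos t - 1}{t}\, dt$$

and the parameter $\gamma = 0.57721...$ is Euler's constant.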

4.  FOURIER-WEIERSTRASS  FRACTAL  ARRAYS 

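We begin our study of bandlimited Fourier-Weierstrass fractal arrays by expressing (14) in the following
convenient form:

$$F(u) = I_0 + 2 \sum_{m=1}^{\infty} \sum_{n=0}^{N-1} I_{mn} \cos(k d_{mn} u + \alpha_{mn}) \tag{21}$$

where

$$I_0 = \frac{f_0}{2} \left[\frac{1 - \eta^{(D-2)N}}{1 - \eta^{(D-2)}}\right] \tag{22a}$$

$$I_{mn} = \frac{f_m}{2}\, \eta^{(D-2)n} \tag{22b}$$

$$k d_{mn} = m\pi\eta^{n} \tag{22c}$$

$$\alpha_{mn} = m\pi\eta^{n} \tag{22d}$$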


and  the  Fourier  coefficients  corresponding  to  a  particular  generating  function  may  be  obtained  through 
the  use  of  (15).  A  useful  representation  of  the  array  factor  for  a  Fourier-Weierstrass  array  with  a  finite 
number  of  elements  may  be  obtained  by  simply  truncating  the  outer  summation  in  (21)  and  interchanging 


the  order  of  summation.  This  leads  to  an  approximate  expression  for  the  desired  fractal  radiation  pattern 
given  by 


$$F(u) \approx I_0 + 2 \sum_{n=0}^{N-1} \sum_{m=1}^{M} I_{mn} \cos(k d_{mn} u + \alpha_{mn}) \tag{23}$$


The  double  summation  appearing  in  (23)  may  be  interpreted  as  representing  the  superposition  of  radiation 
produced  by  a  sequence  of  N  uniformly  spaced  M-element  linear  arrays.  The  recurrence  relation  of  the 
element  spacings  for  each  of  the  M-element  subarrays  is 

$$\Delta_{n+1} = \eta\, \Delta_{n} \quad \text{with} \quad \Delta_0 = \lambda/2 \tag{24}$$


This  unique  property  of  Fourier-Weierstrass  arrays  reveals  their  underlying  fractal  structure  by  suggesting 
that  they  are  composed  of  a  sequence  of  self-similar  uniformly  spaced  linear  subarrays.  Recurrence 
relations  for  the  excitation  current  amplitudes  and  phases  may  be  found  in  a  similar  way 


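$$I_{m,n+1} = \eta^{(D-2)}\, I_{m,n} \quad \text{with} \quad I_{m,0} = \frac{f_m}{2} \tag{25a}$$

$$\alpha_{m,n+1} = \eta\, \alpha_{m,n} \quad \text{with} \quad \alpha_{m,0} = m\pi \tag{25b}$$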

Finally, an expression for the normalized array current excitation amplitudes may be obtained by dividing
(22b) by (22a), which yields

$$i_0 = 1 \tag{26a}$$

$$i_{mn} = \frac{f_m}{f_0} \left[\frac{1 - \eta^{(D-2)}}{1 - \eta^{(D-2)N}}\right] \eta^{(D-2)n} \tag{26b}$$


Figure 1 shows a synthesized radiation pattern formed by a triangular generating function with $\eta = 2$ and a
desired fractal dimension of D = 1.1. A Fourier-Weierstrass array with M = 4 and N = 8 was used to
synthesize this radiation pattern. Various stages in the construction of the Fourier-Weierstrass array with
$\eta = 2$ and a triangular generating function are illustrated in Figure 2. The self-similarity property of the
subarrays is also clearly identifiable in Figure 2.
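The construction described for Figures 1 and 2 can be sketched numerically. The code below uses the quoted values eta = 2, D = 1.1, M = 4, N = 8 and assumes (i) the triangular generating function g(u) = 1 - |u| on [-1, 1], (ii) element parameters I_mn = (f_m / 2) eta**((D-2) n) and k d_mn = alpha_mn = m pi eta**n, and (iii) positions expressed in wavelengths. These forms are a reading of the surrounding equations rather than a transcription of the authors' program, and all names are hypothetical.

```python
import math

ETA, DIM, M, N = 2.0, 1.1, 4, 8          # values quoted for Figure 1

def shifted_triangle_coefficient(m, samples=2000):
    """f_m = 2 * integral_0^1 g(u - 1) cos(m pi u) du for the assumed triangle g(u) = 1 - |u|.
    On 0 <= u <= 1 this triangle gives g(u - 1) = u."""
    h = 1.0 / samples
    return 2.0 * h * sum((i + 0.5) * h * math.cos(m * math.pi * (i + 0.5) * h)
                         for i in range(samples))

f = [shifted_triangle_coefficient(m) for m in range(M + 1)]
r = ETA ** (DIM - 2.0)
I0 = 0.5 * f[0] * (1.0 - r ** N) / (1.0 - r)

# elements[(m, n)] = (position in wavelengths, current amplitude, phase in radians)
elements = {}
for n in range(N):
    for m in range(1, M + 1):
        position = 0.5 * m * ETA ** n            # k d_mn = m pi eta^n, so d_mn / lambda = m eta^n / 2
        current = 0.5 * f[m] * r ** n
        phase = m * math.pi * ETA ** n
        elements[(m, n)] = (position, current, phase)

def array_factor(u):
    """Truncated double sum: I0 + 2 * sum_n sum_m I_mn cos(k d_mn u + alpha_mn)."""
    total = I0
    for position, current, phase in elements.values():
        total += 2.0 * current * math.cos(2.0 * math.pi * position * u + phase)
    return total

# Spacing between adjacent elements within each of the N subarrays.
subarray_spacing = [elements[(2, n)][0] - elements[(1, n)][0] for n in range(N)]
```

The list subarray_spacing starts at half a wavelength and grows by the factor eta from one subarray to the next, which is the self-similar structure the text attributes to these arrays.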


REFERENCES 


[1] K. Falconer, Fractal Geometry, Wiley, New York, 1990.

[2] M. V. Berry, Z. V. Lewis, On the Weierstrass-Mandelbrot fractal function, Proc. R. Soc. Lond. A,
370, pp. 459-484, 1980.

[3] W. L. Stutzman, G. A. Thiele, Antenna Theory and Design, John Wiley & Sons, New
York, 1981.


[Figure 1: Normalized Array Factor]

[Figure 2: stages in the construction of the array; label: Element at Origin]


Wavelet  Transforms  and  Time/Time-scale  Analysis 

(Tutorial  Session) 


Randy  K.  Young,  PhD 

Applied  Research  Laboratory,  The  Pennsylvania 
State  University 

P.  O.  Box  30,  State  College,  PA  16804-0030 
(814) 863-4499 Fax: (814) 863-784
E-mail:  rky@arl.psu.edu 


Tommy  G.  Golsberry 
Office  of  Naval  Research 
Code  321 

800  N.  Quincy  St.,  Arlington,  VA  22217-5660 
(703)  696-0805 


Fractal theories exploit the multi-scale self-similarity properties that nearly define wavelet transforms. However, a
critical component of fractal analysis is that the self-similar "pattern" may not be known a priori; e.g., the underlying
pattern may need to be extracted from the observed phenomenon, as in image compression. This "kernel" function or
underlying pattern is analogous to the "analyzing" or "mother" wavelet of wavelet analysis; these analyzing functions may not
be known a priori, could be arbitrarily chosen, or may be extracted from the signal being analyzed as well. In addition,
similar to fractal analysis, very fine scale steps or scale resolution can be employed. Although much of orthogonal
wavelet theory concentrates on very coarse scale steps (powers of 2), general continuous wavelet transform theory allows
arbitrarily fine scaling. For these more general wavelet transforms, the properties of the resulting representation depend
intimately upon the properties of the analyzing or mother wavelets. Thus, the wavelet analysis applied in fractal
analysis should include very general wavelet transforms and not be limited to the often-used orthogonal transforms.

This introduction and tutorial on wavelet and time/time-scale (space and time in general) analysis will provide a
framework for understanding these theories and intertwining them with established analysis techniques. The general
class of wavelet transforms will be divided into orthogonal and nonorthogonal wavelet transforms, and these will be
examined and compared to conventional Fourier transform techniques. The broad utility of wavelet transforms was
initially driven by the efficiency of orthogonal (or nearly orthogonal) wavelet transforms. Image/video processing has
been substantially impacted.

Although space-time-varying systems theory can also be naturally formulated with wavelet transforms, this brief
introduction cannot adequately cover this topic. However, it should be noted that the scaling operator of the wavelet
transform can warp either the space or time coordinates or both; this scaling action can account for relativistic effects
and reference frame motion. For space-time-varying systems, the wavelet transform is the "natural transform,"
analogous to the Fourier transform being the "natural transform" for linear time (space) invariant systems (LTI
systems); however, unlike the LTI systems having exponential functions as eigenfunctions, the space-time-varying
systems do not have analogous eigenfunctions.

These relationships and analogies will be detailed to provide the interconnections between transform techniques and to
establish the limitations of these analogies. The impact of these techniques in computational electromagnetics is already
significant, and the potential for further, more diversified impacts is also considerable.

INTRODUCTION/BACKGROUND 

Wavelet theory is the mathematics associated with building a model for a signal, system, or process with a set of
"special signals." The special signals are just little waves or "wavelets." They must be oscillatory (waves) and have
amplitudes which quickly decay to zero in both the positive and negative directions (little). See Figure 1.1 for an
example of a wavelet (this is a classical wavelet, termed the "Morlet mother wavelet," after its inventor). The required
oscillatory condition leads to sinusoids as the building blocks (see Figure 1.2). The quick decay condition is a tapering
or windowing operation. These two conditions must be simultaneously satisfied for the function to be a little wave or
wavelet. Forming the product of the oscillatory and decay functions yields the wavelet of Figure 1.1.

[Figure 1.1: Morlet Mother Wavelet]

[Figure 1.2: Oscillatory or Wave Requirement. Panel labels: "Wave, but not 'little' - never decays"; "wavelets"]
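The "oscillation times decay" construction can be illustrated with a few lines of Python. The sketch assumes a Gaussian window and a complex tone at 6 radians per unit time; both are illustrative choices for a Morlet-type wavelet, not parameters stated in the text.

```python
import cmath
import math

def morlet(t, omega0=6.0):
    """Complex Morlet-type wavelet: Gaussian window (decay) times a complex tone (oscillation)."""
    return math.exp(-0.5 * t * t) * cmath.exp(1j * omega0 * t)

dt = 0.01
times = [-8.0 + i * dt for i in range(1601)]
samples = [morlet(t) for t in times]

mean_value = sum(samples) * dt                      # approximates the integral of the wavelet
energy = sum(abs(s) ** 2 for s in samples) * dt     # approximates the integral of |g|^2
```

The window forces the amplitude to negligible levels a few units from the origin (the "little" part), while the tone supplies the oscillation. With these assumed parameters the integral of the wavelet is close to, though not exactly, zero.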


Sets of "wavelets" are employed to approximate a signal (or process, or system, etc.) and each element in the wavelet
set is constructed from the same function, the original wavelet, appropriately called the mother wavelet. Each element
of the wavelet set is a scaled (dilated or compressed) and translated (shifted) mother wavelet. Wavelet theory is a
mathematical tool that can be applied almost anywhere and, like most tools, its primary purpose is to improve
efficiency (analogous to a wrench turning a bolt rather than using your fingers). When wavelet representations are
more efficient than alternative representations for a particular application, wavelet transforms should be employed there.
In some applications wavelet theory may produce undesirable inefficiencies and, thus, wavelet theory should not be
blindly forced upon an application. But for "fractal/chaotic" problems where multiscale self-similarity exists,
wavelet transforms are naturally efficient.


The  Wavelet  Transform 

Before defining the wavelet transform, admissible functions are defined. For a function to be a mother wavelet it must
be admissible. Recall from the earlier discussion that for a function to be a wavelet or mother wavelet it must be
oscillatory and have fast decay toward zero. If these conditions are combined with the condition that the wavelet must
also integrate to zero (its "d.c." or zero frequency component is zero), then these three conditions are the "non-rigorous"
admissibility condition that must be satisfied for a function to be a mother wavelet. Essentially, admissible functions
are bandpass signals - these signals cannot have zero frequency components and they must decay, so they will not have
infinite frequency components either. Note that most signals that travel through a medium or in free space are finite
duration, bandpass waves, so this requirement is not very restrictive. Although a Laplace transform also includes a
kernel function (a decaying exponential) that both decays and oscillates, the decay is always centered around zero
(unlike these wavelets), frequency shifts are used (instead of time scaling), and the kernel of the Laplace transform is an
exponential (exclusively). As will be emphasized throughout this book, the mother wavelet (kernel of the wavelet
transform) can be almost any function.

More rigorously, an $L^2(\mathbb{R})$ function (a finite energy function - square integrable over the range of its independent
variable), g, which can be either real or complex, is an admissible function if:

$$c_g = \int_{-\infty}^{\infty} \frac{|G(\omega)|^2}{|\omega|}\, d\omega < \infty \tag{1.1}$$

where $G(\omega)$ is the Fourier transform of g. Note that the lower limit of integration is minus infinity instead of 0. This
is required if the mother wavelet is complex and has a spectrum that is nonsymmetric about zero frequency. This
admissibility condition is sufficient (may be more restrictive than required) but is not necessary (some functions are
admissible but do not satisfy this condition). A more general necessary and sufficient condition (the complete definition
of admissibility) for functions to be admissible is defined with group theoretic concepts in [Gro2, Hei]. In summary,
admissible functions (and mother wavelets) are those that cycle (oscillate), have finite energy, and have an average
value of zero. Most natural signals satisfy these properties; energy usually travels as wave packets (oscillates and has
an average value of zero - its average square value is the energy). Thus, most natural signals would classify as
admissible functions.
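A quick numerical screen for the non-rigorous conditions (finite energy, zero average) might look like the sketch below. It does not evaluate the admissibility integral (1.1) itself. The two test functions, a plain Gaussian and a "Mexican hat" shape, are illustrative choices that do not come from the text, and the function names are hypothetical.

```python
import math

def integrate(func, lo=-10.0, hi=10.0, steps=4000):
    """Midpoint-rule integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

def gaussian(t):
    """Decays quickly but does not oscillate; its average value is not zero."""
    return math.exp(-0.5 * t * t)

def mexican_hat(t):
    """Second-derivative-of-Gaussian shape (up to sign and scale); used only as a test case."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def looks_admissible(func, tol=1e-6):
    """Non-rigorous screen: finite, nonzero energy and zero mean over the sampled interval."""
    energy = integrate(lambda t: func(t) ** 2)
    mean = integrate(func)
    return math.isfinite(energy) and energy > 0.0 and abs(mean) < tol
```

Under this screen the Mexican hat passes and the Gaussian fails, because the Gaussian has a nonzero d.c. component.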

The wavelet transform operator, $W$, maps a finite energy or $L^2(\mathbb{R})$ signal that is real or complex valued as follows:
$W: L^2(\mathbb{R}) \to L^2(\mathbb{R}\setminus\{0\} \times \mathbb{R})$. Stated less mathematically, any finite energy signal is mapped from the time or
space domain to a finite energy two-dimensional distribution in the scale-translation or wavelet domain. The
continuous wavelet transform of a function, f, with respect to a given admissible mother wavelet, g, is defined as:

$$W_g f(a,b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} f(x)\, g^{*}\!\left(\frac{x-b}{a}\right) dx = \langle f, g_{a,b} \rangle \tag{1.2}$$

which is the wavelet domain coefficient at scale a and translation b. Here the superscript * denotes complex conjugate
and <,> is an inner product (shorthand notation for the correlation integral defined in this equation). Note that this
definition requires g(x) to be an admissible function. The wavelet element, $g_{a,b}$, is defined by a unitary affine
mapping $U(a,b)$: $g_{a,b}(x) = |a|^{-1/2}\, g((x-b)/a)$ or, less mathematically, $g_{a,b}$ is a
version of the mother wavelet, g(x), that has been scaled by the scale parameter, a, and translated by the translation
parameter, b. Note that for continuous wavelet transforms the choice of the mother wavelet is only constrained by the
admissibility condition. One is free to choose the mother wavelet for optimal behavior in the particular application of
interest.
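The definition can be exercised with a direct Riemann-sum evaluation of the correlation integral. The sketch below assumes the common form W(a, b) = |a|**-0.5 * integral of f(x) conj(g((x - b) / a)) dx and a Morlet-type mother wavelet with an assumed centre frequency of 6. The names cwt_coefficient and morlet are hypothetical.

```python
import cmath
import math

def morlet(t, omega0=6.0):
    """Morlet-type mother wavelet (assumed parameters)."""
    return math.exp(-0.5 * t * t) * cmath.exp(1j * omega0 * t)

def cwt_coefficient(signal, times, dt, a, b, wavelet=morlet):
    """One wavelet coefficient at scale a and translation b, by a Riemann sum."""
    acc = 0j
    for x, fx in zip(times, signal):
        acc += fx * wavelet((x - b) / a).conjugate()
    return acc * dt / math.sqrt(abs(a))

dt = 0.01
times = [-20.0 + i * dt for i in range(4001)]
a0, b0 = 2.0, 1.0

# Test input: the wavelet element g_{a0,b0} itself (scaled by a0, translated by b0).
signal = [morlet((x - b0) / a0) / math.sqrt(a0) for x in times]

matched = cwt_coefficient(signal, times, dt, a0, b0)      # hypothesis equals the true (a0, b0)
mismatched = cwt_coefficient(signal, times, dt, 1.0, b0)  # wrong scale hypothesis
```

Because the scaling is unitary, the matched coefficient equals the energy of the mother wavelet, while a wrong scale hypothesis gives a much smaller magnitude.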

[Figure 1.3: Wavelet Transform of f with respect to g (Wavelet Transform Mapping)]

The wavelet transform can be related to the more commonly used Fourier transform or Fourier series. The Fourier
models represent functions as a weighted sum of exponentials at different frequencies. The weights at the different
frequencies are the Fourier coefficients. Wavelet models analogously represent functions as a weighted sum of scaled and
translated mother wavelets. The wavelet transform has a mother wavelet replace the exponential, scaling and translation
replace frequency shifting, and a two-dimensional surface of wavelet coefficients replace the one-dimensional Fourier
coefficients. As a special case, when the mother wavelet is $g(x) = e^{jx}$, $a = 1/\omega$, and $b = 0$, the wavelet transform in
equation (1.2) becomes:

$$W_g f(1/\omega, 0) = \sqrt{|\omega|} \int_{-\infty}^{\infty} f(x)\, e^{-j\omega x}\, dx$$

which is a Fourier transform, apart from the amplitude factor $\sqrt{|\omega|}$. Rigorously, several mathematical difficulties arise with this substitution, but the intuitive
interpretation and inverse relationship between frequency and scale are the desired results.


Wavelet Transform Examples


The wavelet transforms of several elementary functions are presented in the next four figures to display the
characteristics of the wavelet domain representation. Note the increase in dimensionality of the wavelet representation;
the dimensionality increases by a factor of two. For the first three wavelet transforms the mother wavelet was a
complex Morlet mother wavelet (or a Gaussian weighted tone) with about six significant cycles in it (the imaginary part
of this complex mother wavelet was shown in Figure 1.1). For these figures a 30 dB range of magnitude is
displayed - if the magnitude was more than 30 dB below the peak, it was set to zero along with its corresponding phase.
These figures are not studied in detail until the wavelet transform properties are examined; however, note the time
localization property of the wavelet transform for the impulse (or delta function) and the edges of the rectangle.
Frequency or scale localization can also be observed from the wavelet transform of a tone (sinusoid).


Several important conclusions can be made from these figures. First, note that both the magnitude and the phase
localize in both time and scale (frequency). The localization in phase is observed from the converging phase ridges (not
the zeroing due to small magnitudes). The simultaneous localization is due to the entire envelope being moved; the
mother wavelet is translated. For the impulse or delta function, the wavelet transform will simply be the mother
wavelet with translation replacing the time parameter (look at equation (1.2) with the input being an impulse).
At each different scale the wavelet distribution is a scaled mother wavelet.


Note that the phase of the wavelet transform and the real part of the wavelet transform are essentially redundant
information (and, thus, look alike). To be consistent with matched filter or correlation processing, the phase will be
chosen over the real part for the representations in this book.

For all of these cases the mother wavelet was complex. For both the impulse and the rectangle, the input was real. If a
real mother wavelet was used (the real part of the complex mother wavelet), then only the real part of the wavelet
transform would be obtained. The real part of the wavelet transform can be very poor for identification/detection. If
the wavelet transform is only evaluated at points that are near the nulls of the real part of the transform, then those
evaluated points can be sensitive to noise or might suggest a particular feature of the signal is not present when it really
is present. These nulls in the real part can lead to incorrect results and invalid conclusions. This is similar to matched
filter processing with real signals as opposed to complex signals; processing with a real signal produces nulls in the
matched filter's output and leads to sensitivities and a less robust filter. Usually, for general signal processing it will be
desirable to have complex mother wavelets to avoid the possibility of a null (this can be interpreted as a constraint on
the density of the wavelet domain "hypothesis grid").
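The null problem can be reproduced with a small experiment. The sketch below transforms a real tone with a complex Morlet-type wavelet tuned to the tone frequency (an assumed centre frequency of 6 radians per unit time, scale a = 1) and compares the magnitude of the complex coefficient with the magnitude of its real part as the translation b varies. The setup and names are illustrative assumptions.

```python
import cmath
import math

OMEGA0 = 6.0   # assumed centre frequency of the wavelet and of the tone

def morlet(t):
    return math.exp(-0.5 * t * t) * cmath.exp(1j * OMEGA0 * t)

def transform_of_tone(b, a=1.0, dt=0.01, half_width=10.0):
    """W(a, b) of f(t) = cos(OMEGA0 * t) with the complex wavelet, by a Riemann sum around b."""
    steps = int(round(2.0 * half_width * a / dt))
    acc = 0j
    for i in range(steps + 1):
        t = b - half_width * a + i * dt
        acc += math.cos(OMEGA0 * t) * morlet((t - b) / a).conjugate()
    return acc * dt / math.sqrt(abs(a))

translations = [i * math.pi / 60.0 for i in range(21)]            # b from 0 to pi/3
complex_mag = [abs(transform_of_tone(b)) for b in translations]   # complex mother wavelet
real_part = [abs(transform_of_tone(b).real) for b in translations]  # what a real wavelet would give
```

The complex magnitude stays essentially constant over b, whereas the real part passes through a null near b = pi/12. A grid that happened to sample near that point would miss the tone entirely.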


As mentioned previously, and emphasized throughout this paper, the wavelet transform representation and its
properties are dictated by the mother wavelet. Because the set of mother wavelets is so large (essentially any
bandpass signal), the freedom of choosing a mother wavelet makes general wavelet transform characterizations of a
particular function nearly impossible. A valid statement for one mother wavelet can be completely invalid for
another mother wavelet. As an example, consider the wavelet transform of the same rectangular signal as in Figure
E.2, but now with respect to an FM signal with both quadratic and linear modulation (a sophisticated mother wavelet).
This new wavelet transform's magnitude and phase are displayed in Figure E.4. Both Figure E.2 and E.4 are wavelet
transforms of the rectangle signal; however, the qualities of the signal in the transform domain are more distinguishable
in Figure E.2 (such as edges of the signal). For general signal qualities represented in the wavelet transform domain, a
simple mother wavelet should be employed. For many other applications where gain and resolution are important, more
complicated or sophisticated mother wavelets might be used.

[Figure: Wavelet Transform of Impulse]

The characterization of mother wavelets being "bandpass" can be deceiving. A mother wavelet can be wideband and
have multiple simultaneous components at significantly different frequencies; the mother wavelet itself can be
multi-modal, etc. The Fourier spectrum of a mother wavelet can have many "holes," or frequencies with very little
energy, in between frequencies that have a lot of energy. The admissibility constraint is not a very restrictive condition
and the bandpass interpretation should not be accepted too literally. For a further tutorial discussion with more wavelet
transform examples and properties refer to the references [Com, Rio, Wei, You].

Discrete  Time  Wavelet  Series  -  A  Specific  Structure 


The discrete time wavelet series is discrete in both the time domain and the scale-translation (wavelet) domain. Since
time dilation by a factor of 2 can be efficiently implemented simply by dropping every other sample of a discrete
signal (decimating or subsampling by a factor of 2), the transform to be presented here only considers time scaling by
powers of 2. The structure of this particular wavelet decomposition is presented in Figures 1.4 and 1.5. This structure
can be valid for multiresolution, orthogonal, biorthogonal, or PR-QMFs, with each case specifying the requirements for
the filter coefficients. The filters are simply denoted as low pass, g(k), and high pass, h(k), filters to present a general
form, but these filters are intimately related to the mother wavelet of the wavelet transforms [Com, Dau, Mai, Mey, Rio,
Vet]. The high pass filter, h(k), is usually considered as the "mother wavelet" (the order of the coefficients changes in
some cases) and the outputs of the high pass filters are thus the wavelet coefficients (the high pass filter convolves its
impulse response (mother wavelet) with the incoming signal to create an output (a sequence of wavelet coefficients)).


[Figure 1.4: DTWS Processing Block (Discrete Time Wavelet Series Block)]

[Figure 1.5: DTWS Structure]


The decomposition process is demonstrated in Figure 1.4. The entire DTWS decomposition consists of passing the
signal through identically structured processing "blocks." Each block is defined to have both a low pass filter (the
"scaling function" discussed in wavelet literature) and a high pass filter (the "wavelet function"). The output of each
filter is decimated by a factor of 2. The outputs of the lowpass filter are forwarded to the next DTWS block. The
outputs of the high pass filters are the wavelet coefficients. These coefficients are the new representation of the signal.

Multiple "band splitting" blocks are cascaded to form a DTWS. The input signal goes in from the left, a series of
wavelet coefficients comes out the bottom, and a final low frequency time series exits out the right. The low frequency
signal and the wavelet coefficients together represent the time domain signal. This is one example of the DTWS. The
"forward" transform is often termed the analysis filter or analysis stage. The pyramidal structure of this DTWS results
because fewer and fewer coefficients are output from each successive stage until the last, single coefficient is output at
the end of the filter stages (or the peak of the pyramid). The pyramidal decomposition is more easily viewed in the
frequency domain and is the multiresolution wavelet transform (refer to Figure 1.6 to see the pyramidal decomposition
in the frequency domain).
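The block-and-cascade structure can be sketched in a few lines. The example assumes the two-tap Haar pair as the low pass and high pass filters, chosen only because it is the shortest orthogonal pair; the text does not specify the filters. The function names are hypothetical.

```python
import math

def dtws_block(signal):
    """One band-splitting block: Haar low-pass and high-pass filtering, each decimated by 2."""
    s = 1.0 / math.sqrt(2.0)
    low = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    high = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    return low, high

def dtws(signal):
    """Cascade blocks until one low-pass sample remains; the length must be a power of 2."""
    if len(signal) < 2 or len(signal) & (len(signal) - 1):
        raise ValueError("signal length must be a power of 2 and at least 2")
    wavelet_coefficients = []
    low = list(signal)
    while len(low) > 1:
        low, high = dtws_block(low)          # the low-pass output feeds the next block
        wavelet_coefficients.append(high)    # high-pass outputs are the wavelet coefficients
    return wavelet_coefficients, low

coeffs, residual = dtws([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
```

For an eight-sample input the stages emit 4, 2, and 1 wavelet coefficients plus a single residual low-pass sample, which is the pyramid described above. With this orthonormal filter pair the total energy of the outputs equals that of the input.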

[Figure 1.6: Multiresolution Wavelet Transform Structure (Multiresolution Pyramid). Labels: function f to be decomposed; coefficients; HPF = wavelet; composite bandwidths]


From the list of special wavelet transforms, the multiresolution wavelet transforms are the most general [Mai, Mey].
Multiresolution wavelet transforms allow the mother wavelets to be nonorthogonal and have many other properties. The
primary constraint on the mother wavelet (or high pass filter) is really formulated on a different function, the scaling
function (or low pass filter). Multiresolution transforms "build in" a pyramidal structure that is not required for general
wavelet transforms. The pyramidal structure requires a repetitive application of the same (but scaled) scaling and
wavelet functions, or lowpass and highpass filters, respectively. This pyramidal structure forces the scaling function to
satisfy a constraint termed the two-scale equation (originally [Mai] and detailed in [Chu]). In addition, the
multiresolution wavelet transforms often begin with a scaling function that is derived from a spline function (splines are
usually simple functions, such as polynomials, that can be efficiently represented). Many desirable advantages exist for
using splines to derive mother wavelets (and these are detailed by Chui [Chu]), but the constraints imposed on the
mother wavelet (the filters) limit the set of possible mother wavelets. Further details of the mathematics are deferred
to the many references on multiresolution wavelet transforms [Chu, Com, Dau, Mai, Mey, Vet].

However, the standard application of the multiresolution wavelet transform is to form a series of half-band filters that
divide a spectrum into a high frequency band and a low frequency band. These filters initially act on the entire signal
bandwidth and, thus, act at the high frequencies (small scale values) first and gradually reduce the signal bandwidth
with each stage (smaller bandwidths correspond to larger scales). See Figure 1.6. The high frequency band output is
taken as the wavelet transform coefficients for a "fine" scale, and the low frequency band output is decimated by a
factor of 2 (every other sample is discarded). This low frequency band is then split into a high and low band; this band
splitting and decimation process continues and produces an octave band representation of the signal. The wavelet
coefficients for a particular scale are the output samples of a particular high pass filter. Each different output sample
corresponds to a different translation at that particular scale. The output rate of each of these filters is decimated by a
factor of two as the scale value steps in the coarse direction. The pyramidal structure in Figure 1.6 results from the
recursive structure of the multiresolution wavelet transform. The high pass filter outputs (wavelet coefficients) represent
the signal's characteristics and energy at a particular scale. The output of the final lowpass filter is the residual or "d.c."
portion of the signal - the most blurred (most coarse) signal.


For this interpretation of the multiresolution wavelet transform it would be confusing if the mother wavelet was
sophisticated (a large bandwidth or quickly changing characteristics); a frequency "band" may not make sense. If a
sophisticated mother wavelet was used, the filters may have very sharp peaks or have multi-modal shapes; these are not
acceptable in the standard multiresolution analysis. Thus, as with the other constrained wavelet transforms, the mother
wavelets are assumed to be "bandpass" and not sophisticated. The "sophisticated" mother wavelets to be examined
later are not necessarily characterized as bandpass. Subsequent ambiguity analysis will provide insight into the
characteristics of these sophisticated mother wavelets. In some practical applications (especially images) the signal
information does appear to follow an octave band distribution; in many other applications the information does not.

CONCLUSIONS 

Because fractal techniques exploit 1) a priori unknown signals/functions and 2) the self-similarity property to eliminate
redundancy at possibly unknown "scale steps," the utilization of wavelet techniques may require the more general
nonorthogonal wavelet transforms that 1) do not highly constrain the mother or analyzing wavelet and 2) allow the
scale resolution to be nearly arbitrary as well. These features are not necessarily offered by the orthogonal wavelet
transform or multiresolution wavelet transform techniques.

ABSTRACT

In many newer communications and sensing systems the utilization of broadband, high time-bandwidth
product signals mandates that the more general time-scaling be utilized in place of
Doppler shifting. CDMA and spread spectrum techniques are becoming commonplace in wireless
communications. This paper demonstrates the added value of nonorthogonal wavelet transforms,
employed to efficiently formulate the computations in both active monostatic and passive
multisensor processing (applicable to communications and remote sensing systems). The primary
motivation is to provide efficient transform (wavelet) domain processing for wideband (high
time-bandwidth product or spread spectrum) signals, analogous to the efficient Fourier transform domain
processing for narrowband/stationary (small time-bandwidth product) signals.

RECEIVER  PROCESSING 

Typical receivers in communications and remote sensing systems employ simple correlators as their
core processor. The correlation receiver compares a "reference signal" to "modified versions" of a
second signal. The standard correlation receiver, referred to as the narrowband receiver in this
paper, correlates a received signal with a time delayed and Doppler frequency shifted version of
either the transmitted signal (in the active case) or a second received signal (in the multi-sensor
passive case). This receiver performs:

$$\mathrm{NBRec}_{r,o}(\tau,\omega) = \int r(t)\, o^{*}(t-\tau)\, e^{-j\omega t}\, dt$$


on  r(t)  and  o(t)  which  are  the  two  signals  being  processed,  and  hypothesizes  many  time  delays  and 
many  Doppler  shifts  on  o(t).  At  the  correctly  hypothesized  delay  and  Doppler,  the  geometry  and 
motion  are  properly  accounted  for,  and  the  receiver  has  a  high  response  or  achieves  a  peak  in  its 
output. 

For  the  high  time-bandwidth  product  signals,  an  alternative  receiver  accounts  for  time  scaling  rather 
than  just  Doppler  shift.  This  "wideband"  correlation  receiver  performs: 
$$\mathrm{WBRec}_{r,o}(a,b) = \frac{1}{\sqrt{|a|}} \int r(t)\, o^{*}\!\left(\frac{t-b}{a}\right) dt$$


where the time-scale hypothesis is "a" and the time translation parameter is "b." This is the receiver
that is primarily addressed in this paper.

WAVELET-BASED  PROCESSING  TO  EFFICIENTLY 
ACHIEVE  BROADBAND  MONOSTATIC  AND/OR  PASSIVE 
CROSS-SENSOR  PROCESSING 

For many current applications involving satellite assets, the combination of rapid motion (of
the sensor/emitter/transceiver) and high time-bandwidth product signals leads to conditions
that require signals to be time scaled, rather than simply Doppler shifted, to achieve high
processing gains. If the combination of the signal characteristics (high time-bandwidth
product signals) and the sensor/source geometry (sensor separation and sensor and/or
emitter/reflector motion) is such that different Doppler shifts must be used on different
frequency bands to maintain coherence across the entire signal spectrum, then the signal
processing could utilize time scaling to maintain coherence across the duration of the signal
(essentially accounting for the rapid motion of the sensor/emitter/transceiver within the
processing). These conditions are stated mathematically by the "narrowband condition," which
places limits on the parameters to determine when a single Doppler shift can
account for the motion over the entire duration, T, of a signal with bandwidth BW (given a
maximum relative speed v and the speed of light c). If this condition is satisfied, a single
Doppler shift will sufficiently account for the relative sensor motions ([Van] and many other
references; refer to [You] or [Wei] for further references):


$$\frac{2v}{c} \ll \frac{1}{T \cdot BW}$$


or


$$T \ll \frac{c}{2v \cdot BW}$$
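
As a rough numerical illustration of the narrowband condition, the sketch below compares 2v/c with 1/(T BW). The function names, the `margin` parameter used to stand in for "much less than," and the example values (7500 m/s, 1 ms, 10 MHz) are assumptions chosen for illustration; they do not come from the paper.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def narrowband_margin(v, duration, bandwidth, c=C_LIGHT):
    """Ratio of 1/(T*BW) to 2v/c; the condition holds when this is much larger than 1."""
    return c / (2.0 * v * duration * bandwidth)


def max_narrowband_duration(v, bandwidth, margin=10.0, c=C_LIGHT):
    """Longest duration T for which 2v/c stays `margin` times below 1/(T*BW)."""
    return c / (2.0 * v * bandwidth * margin)


# Hypothetical example: relative speed 7500 m/s, 1 ms signal, 10 MHz bandwidth.
example_margin = narrowband_margin(7500.0, 1e-3, 10e6)
example_limit = max_narrowband_duration(7500.0, 10e6)
print(f"margin = {example_margin:.3f}, duration limit = {example_limit:.3e} s")
```

With these assumed values the margin is only about 2, so a single Doppler shift would be a questionable model over the full 1 ms; shortening the duration or applying time scaling would be the alternatives.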


By accounting for the true time-scaling of the signal due to the relative motions, the interval
over which coherent processing can be achieved is significantly increased. The relative speeds
can be completely accounted for by applying the proper time-scaling to the signals; thus, the
limitations on the processing are imposed by the relative accelerations, acc, instead (note that
linear velocities will cause relative acceleration if sensors/emitters/transceivers are not moving
directly at one another). The "first order time-varying" condition, typically applied to high
time-bandwidth product signals, identifies the duration over which an applied time-scale will
account for the relative motion and maintain the coherence of the entire signal. This
condition is:
[Inequality bounding T in terms of acc and BW; not legible in the source.]

Note  that  the  valid  processing  duration  will  be  significantly  larger  when  the  time-scaling  is 
applied  to  the  signals  rather  than  a  single  Doppler  shift. 

The time scaling operation that is required in the broadband processing is extremely
computationally intensive when performed with standard multirate filtering
[Vai]. When geometries are not precisely known or remote sensing is being performed, many
time-scale hypotheses are necessary. Massive computational demands result from the required
fine-grained time scaling; the fine time scaling is necessary because of the fine time-scale
resolution inherent in these high time-bandwidth signals. In multirate filters a fine time-scale
change, or resampling of the digital signal, is accomplished by upsampling and lowpass filtering
(interpolating) and then downsampling (decimating); the upsampling and decimation rates
are integers, but different integers (if they were the same integer, the time scale
would be one and the signal would be unchanged). For example, assume that a time-scale
accuracy of 0.0001 is required to achieve a desired resolution. With a multirate filter the
resampling would require a ratio of 10001 to 10000, which is 1.0001 in time scale. Although
this is possible, it is very inefficient for many fine scales, especially if the possible time scales
cover a broad and continuous range. Multirate filters achieve efficiency by using several
stages of filters at different sampling rates; however, to achieve this efficiency one integer in
the upsample/downsample ratio must remain fixed and the other integer must have simple
factors (preferably several factors, all less than about 10). For a fine time-scale grid
without any hypothesis gaps, these constraints on the integers (which define the time scale
itself) are not practical. An alternative processor formulates the time scaling as a simple, efficient
operation "in the wavelet domain": it operates on the signals' wavelet transforms rather than
directly on the time signals or on the spectra of the signals.
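
The 10001-to-10000 example can be checked with a short sketch. The helper names below are hypothetical; the code only reduces the time scale to an integer ratio and factors the two integers, to show why the "small factors" requirement of multistage multirate filters is hard to meet.

```python
from fractions import Fraction


def resampling_ratio(time_scale):
    """Exact upsample/downsample ratio for a time scale given as a decimal string."""
    return Fraction(time_scale)


def prime_factors(n):
    """Prime factorization of a positive integer by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


ratio = resampling_ratio("1.0001")
up, down = ratio.numerator, ratio.denominator
print(up, prime_factors(up))      # upsampling rate and its factors
print(down, prime_factors(down))  # downsampling rate and its factors
```

Here the downsampling rate 10000 factors into small primes, but the upsampling rate 10001 factors as 73 x 137, so it cannot be split into stages with factors below about 10.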

WAVELET  DOMAIN  TIME-SCALING 

The primary theoretical concept is to map the inefficient time-scaling operation from the time
domain (as done with multirate filters) to the wavelet domain. In the wavelet domain the
operation of time-scaling a signal becomes simply shifting the signal along the scale axis;
however, orthogonal wavelet transforms cannot be utilized for this operation; only
nonorthogonal wavelet transforms can be employed. Since shifts are simple, efficient
operations, this method of time-scaling is preferable in many (but possibly not all) processing
architectures. Thus, the time-scaling operation is achieved by a shift of the wavelet transform
of the signal along the scale axis. For both active and passive processing a time delay must
also be applied to one of the signals in addition to the time scaling. This time delay
can be accomplished in the wavelet domain as well, by
a shift along the time/translation/delay axis. Thus the two operations of
time delay and time-scaling can be accomplished by shifts along each axis in the wavelet
domain representation of the signal. See Figure 1, where the received signal is r(t) and,
depending on whether multisensor passive or monostatic active processing is being performed,
o(t) is the "other" signal (another received signal or the transmitted replica).
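
The shift property can be made explicit with a short derivation. It is a sketch that assumes the usual continuous wavelet transform definition and a positive scale $a_0$; the symbols $s$, $a_0$, $b_0$, and $u$ are introduced here only for illustration.

```latex
% Assumed definition of the wavelet transform with respect to the mother wavelet g:
%   WT_g[f](a,b) = |a|^{-1/2} \int f(t) g^*((t-b)/a) dt
% Let s(t) be a time-scaled and delayed copy of o(t), with a_0 > 0:
\[
  s(t) = \frac{1}{\sqrt{a_0}}\, o\!\left(\frac{t-b_0}{a_0}\right).
\]
% Substituting u = (t-b_0)/a_0, so that dt = a_0\,du:
\[
  WT_g[s](a,b)
  = \frac{1}{\sqrt{a\,a_0}} \int o(u)\, g^{*}\!\left(\frac{a_0 u + b_0 - b}{a}\right) a_0\, du
  = \frac{1}{\sqrt{a/a_0}} \int o(u)\, g^{*}\!\left(\frac{u - (b-b_0)/a_0}{a/a_0}\right) du ,
\]
% which is the transform of o evaluated at a shifted scale and translation:
\[
  WT_g[s](a,b) = WT_g[o]\!\left(\frac{a}{a_0},\, \frac{b-b_0}{a_0}\right).
\]
```

On a logarithmic scale axis the factor $a/a_0$ is a shift by $\log a_0$. The translation argument is shifted by $b_0$ and also rescaled by $a_0$, which is close to a pure shift when $a_0$ is near one.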


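[Figure 1. Block diagram of the wavelet-domain wideband receiver. r(t) and o(t) are each
wavelet transformed, giving WTg[r(t)](a,b) and WTg[o(t)](a,b), with freedom in choosing the
mother wavelet g. The hypotheses and geometry are mapped to a corresponding time delay and
time scale; one transform is shifted along the a-axis by the scale and along the b-axis by the
delay; the two transforms are then multiplied and integrated over a and b.]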


Where WTg is the wavelet transform operator with respect to the "mother" or
analyzing wavelet function, g(t), and the "a" and "b" parameters are the time-scale and
time-translation parameters, respectively. If, under special circumstances, the time delay and
time-scale grids of hypotheses are "uniform" (equal grid spacing between successive delay hypotheses, and
similarly for the time-scale hypotheses), the operations of shifting the wavelet domain
representation and performing the multiplications and sums can all be combined. The
combination of these operations acting on the two wavelet domain representations (both 2D
signal representations) is a 2D convolution. Under this restrictive assumption many fine time
delay and time-scale hypotheses can be simultaneously and efficiently evaluated using
efficient 2D convolutional techniques. See Figure 2.
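
The equivalence between "shift, multiply, and sum" over a uniform hypothesis grid and a 2D correlation can be illustrated with a small sketch. It assumes circular shifts on a tiny grid and uses a plain double-loop DFT so that it runs with the standard library alone; in practice an FFT would replace `dft2`. All function names and the sample arrays are hypothetical, and the uniform scale grid is assumed to be uniform in the sampled scale index.

```python
import cmath


def dft2(x, inverse=False):
    """Naive 2D discrete Fourier transform (or its inverse) of a list of lists."""
    rows, cols = len(x), len(x[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for m in range(rows):
                for n in range(cols):
                    angle = sign * 2 * cmath.pi * (u * m / rows + v * n / cols)
                    total += x[m][n] * cmath.exp(1j * angle)
            out[u][v] = total / (rows * cols) if inverse else total
    return out


def correlate_by_shifting(wr, wo):
    """For every (scale shift m, delay shift n): shift wo, multiply by wr, and sum."""
    rows, cols = len(wr), len(wr[0])
    return [[sum(wr[i][j] * wo[(i - m) % rows][(j - n) % cols].conjugate()
                 for i in range(rows) for j in range(cols))
             for n in range(cols)]
            for m in range(rows)]


def correlate_by_transform(wr, wo):
    """Same result for all shifts at once, via the 2D correlation theorem."""
    fr, fo = dft2(wr), dft2(wo)
    product = [[a * b.conjugate() for a, b in zip(row_r, row_o)]
               for row_r, row_o in zip(fr, fo)]
    return dft2(product, inverse=True)


# Hypothetical 3 x 4 wavelet-domain samples (rows: scale index, columns: delay index).
wr = [[complex(i + 1, j) for j in range(4)] for i in range(3)]
wo = [[complex((i * j) % 3, i - j) for j in range(4)] for i in range(3)]

direct = correlate_by_shifting(wr, wo)
fast = correlate_by_transform(wr, wo)
max_error = max(abs(direct[m][n] - fast[m][n]) for m in range(3) for n in range(4))
print(max_error)
```

The transform route evaluates every delay and scale hypothesis in one pass, which is the source of the efficiency when the grid is uniform.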
A significant additional feature of this wavelet domain processing is that the signals have this
intermediate wavelet domain representation. As discussed in [You], the wavelet domain
representation is sensitive to the chosen mother wavelet. If the mother wavelet is chosen
properly, it could provide significant classification features regarding the environment being
sensed. So, this intermediate wavelet transform representation could be a significant
byproduct for the classification process.

The mathematics that justify and formulate this concept follow. Several more detailed
derivations are in [You, You1, You2]. The wavelet domain wideband receiver is formulated
as:

where WTg[r(t)] is the two-dimensional function (the wavelet transform of r(t)) and the
arguments, or independent variables, of the transform are in the brackets [ ]. The intuitive effects of
these equations are more easily interpreted with the earlier figures. The presentation will
provide examples of these computations and quantify the computational savings for those
particular examples on a particular architecture.
CONCLUSIONS


In many newer communications and sensing systems the utilization of broadband, high
time-bandwidth product signals mandates that the more general time-scaling be utilized in place
of Doppler shifting. Nonorthogonal wavelet transforms are utilized to efficiently formulate
the computations in both active monostatic and passive multisensor processing (applicable to
communications and sensing systems). The efficiency gains are very significant but are also
architecture dependent. The presentation will provide examples of these computations and
quantify the computational savings for those particular examples on a particular architecture.
Abstract: Orthonormal wavelets have been successfully used
as basis and testing functions in integral equations to replace
the pulse, triangular, and PWS (piecewise sinusoidal)
functions. Very sparse coefficient matrices have been obtained
due to the vanishing moments, localization, and MRA
(multiresolution analysis) of the wavelets. However, in
many practical problems, the solution domain is confined
to a bounded interval, while the wavelets are defined on the
entire real line. To overcome this problem, periodic wavelets
were introduced. Nonetheless, the unknown functions must
take on equal values at the endpoints of the bounded interval
in order to apply periodic wavelets as the basis functions.
This requirement has limited the applicability of the
periodic wavelets. In particular, when three-dimensional
problems are considered, the unknown must have a constant
value at its contour boundary, which is too restrictive.
In this paper we present a new approach, employing the
intervallic wavelets. The intervallic wavelets form an
orthonormal basis and preserve the same MRA as the usual
unbounded wavelets. No requirements on the endpoint
values are imposed if the unknown function is expanded
in terms of intervallic wavelets. No biorthogonal basis is
needed. Hence, the intervallic wavelets are very versatile
when applied to surface integral equations in which the unknowns
are defined on spatial surfaces that are bounded by line
contours. There is no need for the unknowns to take a constant
value on the boundary contour. The construction of
the intervallic wavelets is presented. Numerical examples of
scattering and guided wave problems are discussed.


I. INTRODUCTION

Recently, a new category of orthogonal systems, namely orthogonal
wavelets, has emerged [1],[2]. In computer vision
and signal processing, wavelets have become a hot topic
[3], mainly due to the multiresolution analysis (MRA) and
the localization properties in both the space and frequency
domains. Orthogonal wavelets also have many fascinating
properties for electromagnetic field computations. First,
wavelets are sets of orthonormal bases of $L^2(R)$. They are
problem-independent orthogonal bases and thus are suitable
for numerical computations for general cases. Second,
the trade-off between orthogonality and continuity is
well balanced in orthogonal wavelet systems, because
the orthogonality always holds whether the supporting
regions overlap or not. One can build an orthogonal
wavelet system with any order of continuity, expecting
larger supporting regions as a higher order of continuity is
selected. Third, in addition to the advantages of the traditional
orthogonal basis systems, orthogonal wavelets have
zero moments, so that they are much more likely to
yield sparse systems of linear algebraic equations [4].
Furthermore, orthogonal wavelets have localization properties
in both the space and frequency domains. Therefore, the
decorrelation of the expansion coefficients occurs in both
the space and Fourier domains. Nevertheless, according
to the theory of multigrid processing [5], one can improve
convergence by operating on both fine and coarse grids to
reduce both the high-frequency and low-frequency component
errors between the approximate and exact solutions,
in contrast to the traditional way of operating only on fine
grids to reduce the high-frequency component. The expansion
with subsectional bases is actually equivalent to the
expansion on the finest scale only. In contrast, the
multiresolution analysis implemented by wavelet expansion
provides a multigrid method. Finally, the pyramid scheme
employed in wavelet analysis provides fast algorithms
[4].

Wavelets have been successfully used in electromagnetics
to solve resonance and interference problems using compactly
supported Daubechies wavelets [6]. To extend the
domain of the 1D wavelets from the real line to curves
and closed contours, the boundary element method has
been combined with wavelets [7]. The fast wavelet algorithm
was proposed [4] and applied to EM problems [8]
to reduce the computational effort of the inner product
computations for the coefficient matrix. For more effective
treatment of the end points of an interval on which
the unknown is defined, periodic wavelets and biorthogonal
wavelets have been employed [8],[9]. Although these
two approaches have improved the efficiency, each has its
limitations. For instance, in order to use periodic wavelets,
the unknown must have equal values at the two endpoints
of the interval. This constraint becomes more restrictive
when the unknown is defined on a surface. In this case the
unknown must take equal values on the contour on which
the surface is spanned.

In this article we introduce the intervallic wavelets, which
release the above constraint.

II.  BASIC  WAVELET  THEORY 

2.1  Scaling  Function 

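A multiresolution analysis of $L^2(R)$ is defined as a sequence
of closed subspaces $V_j$ of $L^2(R)$, $j \in Z$, with the following
properties:

$$V_j \subset V_{j+1}$$

$$v(x) \in V_j \iff v(2x) \in V_{j+1}$$

$$v(x) \in V_0 \iff v(x+1) \in V_0$$

$$\bigcup_j V_j \ \text{is dense in}\ L^2(R) \quad \text{and} \quad \bigcap_j V_j = \{0\}$$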
A scaling function $\varphi(x) \in V_0$, with a non-vanishing integral,
exists such that the collection $\{\varphi(x-l) \mid l \in Z\}$ is a
Riesz basis of $V_0$.

Since $\varphi \in V_0 \subset V_1$, a sequence $(h_k) \in l^2(Z)$ exists such
that the scaling function satisfies

$$\varphi(x) = \sqrt{2} \sum_k h_k\, \varphi(2x-k)$$

This functional equation is referred to as the dilation equation,
where $\{h_k\}$ are the coefficients of the lowpass filter
and

$$\sum_k h_k = \sqrt{2}$$

It is immediate that the collection of functions $\{\varphi_{j,l} \mid l \in Z\}$,
with

$$\varphi_{j,l}(x) = \sqrt{2^j}\, \varphi(2^j x - l)$$

is a Riesz basis of $V_j$.
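
As a concrete check of the lowpass-filter properties, the sketch below uses the four-coefficient Daubechies filter. This filter is an illustrative stand-in, not the one used in the paper, and the helper name is hypothetical. In the normalization where the dilation equation carries a factor of $\sqrt{2}$, the coefficients sum to $\sqrt{2}$ and are orthonormal under even shifts.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)

# Four-coefficient Daubechies lowpass filter (illustrative choice).
H = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]


def shifted_inner_product(h, shift):
    """Sum over k of h[k] * h[k + 2*shift], with h taken as zero outside its support."""
    return sum(h[k] * h[k + 2 * shift] for k in range(len(h) - 2 * shift))


coefficient_sum = sum(H)                  # should equal sqrt(2)
unit_norm = shifted_inner_product(H, 0)   # should equal 1
orthogonal = shifted_inner_product(H, 1)  # should equal 0
print(coefficient_sum, unit_norm, orthogonal)
```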

2.2 Wavelets

We will use $W_j$ to denote a space complementing $V_j$ in
$V_{j+1}$, i.e. a space that satisfies

$$V_{j+1} = V_j \oplus W_j$$

and

$$\bigoplus_j W_j = L^2(R)$$

A function $\psi$ is a wavelet if the collection of functions
$\{\psi(x-l) \mid l \in Z\}$ is a Riesz basis of $W_0$. The collection
of wavelet functions $\{\psi_{j,l} \mid j, l \in Z\}$ is then a Riesz
basis of $L^2(R)$. The definition of $\psi_{j,l}$ is similar to that of
$\varphi_{j,l}$. Since the wavelet $\psi$ is an element of $V_1$, a sequence
$(g_k) \in l^2(Z)$ exists such that

$$\psi(x) = \sqrt{2} \sum_k g_k\, \varphi(2x-k)$$

The wavelets are orthogonal wavelets, satisfying

$$g_k = (-1)^k h_{1-k}$$

III. PERIODIC WAVELETS

So far we have been discussing wavelet theory on the real
line. For many applications, the functions involved are only
defined on a compact set, such as an interval or surface. In
order to apply wavelets, some modifications are required.
Consider a periodic function with period 1, i.e., $f(x+1) = f(x)$;
then the wavelet coefficients on a given scale satisfy
$\langle f, \psi_{j,k} \rangle = \langle f, \psi_{j,k+2^j} \rangle$, $k \in Z$, and $j \ge 0$. A periodic
MRA on the interval [0, 1] can be constructed by periodizing
the basis functions as follows:

$$\varphi^{*}_{j,l}(x) = \sum_{m \in Z} \varphi_{j,l}(x+m) \quad \text{for } 0 \le l < 2^j \text{ and } j \ge 0$$

$$\psi^{*}_{j,l}(x) = \sum_{m \in Z} \psi_{j,l}(x+m) \quad \text{for } 0 \le l < 2^j \text{ and } j \ge 0$$

If the support of $\psi_{j,k}$ is a subset of [0, 1], then $\psi^{*}_{j,k} = \psi_{j,k}$.
Otherwise, $\psi_{j,k}$ is chopped into pieces of length
1, which are shifted onto [0, 1] and added up, yielding
$\psi^{*}_{j,k}$. This "wrap around" procedure is satisfactory in
many situations. However, unless the behavior of the function
f at 0 matches that at 1, the periodic version of f has
a singularity there.

IV. INTERVALLIC WAVELET

Standard wavelet analysis involves constructing bases for
collections of functions on the real line R, such as the square
integrable functions on the real line, $L^2(R)$. For many applications
it is necessary, or at least more natural, to work on
a subset of the real line.

4.1 Scaling Functions

Let us sketch the construction of orthogonal wavelets on
[0, 1], which was proposed by B. Jawerth [10,11]. Start from
an orthogonal Coifman scaling function with 6N non-zero
coefficients, and assume the scale is fine enough so that
the endpoints are independent. All polynomials of degree
< 2N can be written as linear combinations of the $\varphi_{j,k}$ for
$k \in Z$, with coefficients that are polynomials of degree < 2N.
Hence, the windowed parts of polynomials confined to [0, 1]
are in the restriction of $V_j$ to [0, 1]. Since the $\{\varphi_{j,k}\}$ are an
orthonormal basis for $V_j$, any monomial $x^{\alpha}$, $\alpha \le 2N-1$,
has the representation

$$x^{\alpha} = \sum_k \langle x^{\alpha}, \varphi_{j,k} \rangle\, \varphi_{j,k}(x)$$

The restriction to [0, 1] can then be written

$$x^{\alpha}\big|_{[0,1]} = \left( \sum_{k=-4N+2}^{2N} + \sum_{k=2N+1}^{2^j-4N} + \sum_{k=2^j-4N+1}^{2^j+2N} \right) \langle x^{\alpha}, \varphi_{j,k} \rangle\, \varphi_{j,k}(x)\big|_{[0,1]}$$

Let

$$x^{\alpha}_{j,L} = 2^{j(\alpha+1/2)} \sum_{k=-4N+2}^{2N} \langle x^{\alpha}, \varphi_{j,k} \rangle\, \varphi_{j,k}(x)\big|_{[0,1]}$$

and

$$x^{\alpha}_{j,R} = 2^{j(\alpha+1/2)} \sum_{k=2^j-4N+1}^{2^j+2N} \langle x^{\alpha}, \varphi_{j,k} \rangle\, \varphi_{j,k}(x)\big|_{[0,1]}$$

where subscripts L and R represent left and right.

Hence

$$2^{j/2} (2^j x)^{\alpha} = x^{\alpha}_{j,L} + 2^{j(\alpha+1/2)} \sum_{k=2N+1}^{2^j-4N} \langle x^{\alpha}, \varphi_{j,k} \rangle\, \varphi_{j,k}(x)\big|_{[0,1]} + x^{\alpha}_{j,R}$$

Define the spaces $V_j$, $j \ge j_0$, to be the linear span of the
functions $\{x^{\alpha}_{j,L}\}_{\alpha \le 2N-1}$, $\{x^{\alpha}_{j,R}\}_{\alpha \le 2N-1}$, and $\{\varphi_{j,k}|_{[0,1]}\}_{k=2N+1}^{2^j-4N}$,
namely

$$V_j = \mathrm{span}\left( \{x^{\alpha}_{j,L}\}_{\alpha \le 2N-1} \cup \{\varphi_{j,k}|_{[0,1]}\}_{k=2N+1}^{2^j-4N} \cup \{x^{\alpha}_{j,R}\}_{\alpha \le 2N-1} \right)$$

The collections $\{x^{\alpha}_{j,L}\}_{\alpha \le 2N-1}$, $\{x^{\alpha}_{j,R}\}_{\alpha \le 2N-1}$, and
$\{\varphi_{j,k}|_{[0,1]}\}_{k=2N+1}^{2^j-4N}$ are mutually orthogonal. From the previous
construction, all polynomials of degree $\le 2N-1$ are in $V_j$, and
it can be proved that the $V_j$ form an MRA of $L^2([0, 1])$.

All of the functions in the collections are linearly independent
and can be used as basis functions. In order to form an
orthonormal basis, we only have to orthogonalize the functions
$x^{\alpha}_{j,L}$ and $x^{\alpha}_{j,R}$.

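4.2 Orthogonalization

More specifically, let us consider the left endpoint and set

$$\tilde{x}^{\alpha}_{j,L} = \sum_{\beta=0}^{2N-1} a_{\alpha\beta}\, x^{\beta}_{j,L}$$

with the $2N \times 2N$ matrix $A = \{a_{\alpha\beta}\}$.

For a symmetric matrix $X$ with each element

$$X_{\alpha\beta} = \langle x^{\alpha}_{j,L}, x^{\beta}_{j,L} \rangle$$

the orthonormality condition is

$$I = A X A^{T}$$

Now note that $X$ is positive definite and symmetric; hence,
the Cholesky decomposition holds, namely $X = C C^{T}$. The
choice of

$$A = C^{-1}$$

satisfies this condition; that is, we have proven that the functions in

$$\{\tilde{x}^{\alpha}_{j,L}\}_{\alpha \le 2N-1}$$

are orthonormal. Similarly, we perform the orthogonalization
of $x^{\alpha}_{j,R}$.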
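
The Cholesky-based orthonormalization can be sketched numerically. As a stand-in for the Gram matrix of the boundary functions, which is not available here, the example uses the Gram matrix of the monomials 1, x, x^2 on [0, 1] (a 3 x 3 Hilbert matrix); the helper names are hypothetical. Any symmetric positive definite Gram matrix would be treated the same way.

```python
import math


def cholesky(x):
    """Lower-triangular C with x = C * C^T, for a symmetric positive definite x."""
    n = len(x)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(c[i][k] * c[j][k] for k in range(j))
            if i == j:
                c[i][j] = math.sqrt(x[i][i] - s)
            else:
                c[i][j] = (x[i][j] - s) / c[j][j]
    return c


def invert_lower(c):
    """Inverse of a lower-triangular matrix by forward substitution."""
    n = len(c)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for row in range(col, n):
            rhs = 1.0 if row == col else 0.0
            s = sum(c[row][k] * inv[k][col] for k in range(col, row))
            inv[row][col] = (rhs - s) / c[row][row]
    return inv


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


# Stand-in Gram matrix: <x^i, x^j> on [0, 1] equals 1 / (i + j + 1).
gram = [[1.0 / (i + j + 1) for j in range(3)] for i in range(3)]

c = cholesky(gram)
a = invert_lower(c)                                   # A = C^{-1}
check = matmul(matmul(a, gram), transpose(a))         # should be the identity
print(check)
```

The rows of `a` give the coefficients that combine the original functions into an orthonormal set, since A X A^T reduces to the identity.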


4.3  Wavelets 

To obtain the corresponding wavelets we let $W_j$ be the
orthogonal complement of $V_j$ in $V_{j+1}$. The wavelets $\psi_{j,k}$
with $3N < k < 2^j - 3N$ are all in $V_{j+1}$ and are confined entirely
inside [0, 1]. The remaining 6N functions required for an
orthonormal basis of $W_j$ can be found by using

V’f  +  EI  “  X/  V^J  +  'd’ >  PjA  + 

k 

+  >  V’jA- 

k 

V. SOLVING INTEGRAL EQUATION
In this section we apply the intervallic scaling functions
and wavelets in solving the integral equation

$$\int_0^1 f(x')\, K(x,x')\, dx' = g(x)$$

5.1 Expansion in Terms of Intervallic Wavelets
With the domain of integration being [0, 1], let us expand
the unknown function f in the integral equation in terms of
the scaling functions and wavelets on the bounded interval
as

/(•»0  = 

k 

k  i>}0  t 

Here the $\{\psi_{j,k}\}$ terms represent bandpass filter characteristics
and extract successively lower and lower frequency components
of the unknown function with decreasing values of
the scale parameter j, while the $\{\varphi_{j_0,k}\}$ terms indicate
lowpass filter characteristics and retain the lowest frequency
components, or the coarsest approximation, of the original
function.

The second expansion of f is substituted in the integral
equation, and the resultant equation is tested with the same
set of expansion functions. As a result, a set of linear equations
is formed


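$$A = \begin{bmatrix} A_{\varphi\varphi} & A_{\varphi\psi} \\ A_{\psi\varphi} & A_{\psi\psi} \end{bmatrix}, \qquad B = \begin{bmatrix} \langle g, \varphi_{j_0,k'} \rangle_{k'} \\ \langle g, \psi_{j',k'} \rangle_{j',k'} \end{bmatrix}$$

and X holds the unknown expansion coefficients, where

$$A_{\varphi\varphi} := \langle \varphi_{j_0,k'}, (L_K \varphi_{j_0,k}) \rangle_{k,k'}$$

$$A_{\varphi\psi} := \langle \varphi_{j_0,k'}, (L_K \psi_{j,k}) \rangle_{j,k,k'}$$

$$A_{\psi\varphi} := \langle \psi_{j',k'}, (L_K \varphi_{j_0,k}) \rangle_{k,j',k'}$$

$$A_{\psi\psi} := \langle \psi_{j',k'}, (L_K \psi_{j,k}) \rangle_{j,k,j',k'}$$

$$\langle f, g \rangle = \int_0^1 f(x)\, g(x)\, dx$$

$$(L_K f)(x) = \int_0^1 f(x')\, K(x,x')\, dx'$$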

A,^  and  A,;  are  vector. 

5.2  Numerical  Integration 

The evaluation of the coefficient matrix entries involves
numerical integrations, which are time consuming. However,
by taking advantage of the vanishing moments and
compact support of the wavelets, many entries can be directly
identified or calculated without performing the quadrature
procedures. Away from the singular points of the kernel,
the integrand behaves locally as a polynomial. Consequently,
an integral that contains at least one wavelet
function, as the basis or the testing function, will result in
zero. Meanwhile, an integral that contains scaling
functions as both basis and testing functions will take the
zero-order moment of the kernel. For those integrals in which
the basis and testing functions overlap, and therefore the
kernel singular point lies within the integration interval, the
numerical integration has to be conducted. Even though
the integration limits range from 0 to 1, the intervals of
actual integration are much smaller because of the compact
support of the intervallic father and mother wavelets.
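
The effect of vanishing moments can be seen in a discrete setting. The sketch below is an illustration under stated assumptions: it uses the four-coefficient Daubechies filter pair rather than the intervallic wavelets of this paper, builds the highpass filter from the lowpass one by alternating signs and reversing the order (an index-shifted form of the usual orthogonal-wavelet relation), and uses hypothetical helper names. Locally linear samples give a zero wavelet coefficient, while locally quadratic samples do not.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)

# Four-coefficient Daubechies lowpass filter and its alternating-sign mirror.
H = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
G = [(-1) ** k * H[len(H) - 1 - k] for k in range(len(H))]


def discrete_moment(g, order):
    """Sum over k of k**order * g[k]."""
    return sum((k ** order) * gk for k, gk in enumerate(g))


def wavelet_coefficient(samples, g, start):
    """Inner product of the highpass filter with a window of samples."""
    return sum(gk * samples[start + k] for k, gk in enumerate(g))


linear = [2.0 + 3.0 * k for k in range(8)]       # locally a degree-1 polynomial
quadratic = [float(k * k) for k in range(8)]     # degree 2: not annihilated

linear_coefficient = wavelet_coefficient(linear, G, 2)
quadratic_coefficient = wavelet_coefficient(quadratic, G, 2)
print(discrete_moment(G, 0), discrete_moment(G, 1), discrete_moment(G, 2))
print(linear_coefficient, quadratic_coefficient)
```

A wavelet with more vanishing moments annihilates polynomials of higher degree, which is why entries far from the kernel singularity can be set to zero without quadrature.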


VI. NUMERICAL EXAMPLES
Shown in Fig. 1 is a perfectly conducting cylinder, which is
excited by an impressed electric field $E^i_z$. The induced
current $J_z$ on the conducting cylinder produces a scattered
field $E^s_z$. The boundary condition is

$$E_z = E^i_z + E^s_z = 0 \quad \text{on } C$$

that is, the tangential electric field vanishes on C, the contour
of the ellipse. Hence, we have the integral equation

$$E^i_z(\rho) = \frac{k\eta}{4} \int_C J_z(\rho')\, H_0^{(2)}(k|\rho - \rho'|)\, dl'$$

where $E^i_z(\rho)$ is known and $J_z$ is the unknown, $H_0^{(2)}$ is the
Hankel function of the second kind, zero order, $k = 2\pi/\lambda$, and
$\eta \approx 120\pi$.

The incident field from the direction $\phi_i$ is given by

$$E^i_z = e^{jk(x \cos\phi_i + y \sin\phi_i)}$$

A parameter of interest is the scattering cross section $\sigma$,
defined as the width for which the incident wave carries sufficient
power to produce, by omnidirectional radiation, the
same scattered power density in a given direction. Mathematically,
this is

$$\sigma(\phi) = 2\pi\rho \left| \frac{E^s(\phi)}{E^i} \right|^2$$

where $E^s(\phi)$ is the distant field from $J_z$. It can be found
by using the asymptotic expression for $H_0^{(2)}$. The result is

=  r/a-A'  /  + 


wtiere 


and 


1<{P) 


This can be evaluated numerically once $J_z$ is found.

Using the procedures described in Section V, the corresponding
matrix equations are solved for an elliptic cylindrical surface.
Figs. 2, 3, 4 and 5 show the surface current distribution and
radar cross section obtained by the conventional MoM and by this method.
The results of the conventional MoM and this method agree
very well.


VII. CONCLUSIONS

In this paper, the intervallic wavelets are constructed and
applied to the solution of boundary integral equations for
electromagnetic problems in which the unknown functions
are defined on a finite interval. Numerical examples are
provided. The results agree well with the moment method
solutions.
Abstract

To help reduce the data processing burden during target identification from radar cross
section data, many time-frequency transforms were investigated. It is found that some transforms
show more potential than others. The Daubechies-6, Daubechies-20, and Mallat transforms yielded
coefficients which clustered in a useful manner, while the STFT, FFT, Choi-Williams, and
Wigner-Ville transforms did not perform as well.


Introduction 

There is currently an increased interest in using radar cross section (RCS) data for target
identification. To make a correct identification from the early time response, a large amount of radar
data has to be analyzed. The processing of such a large amount of data makes the target
identification process both costly and slow. Consequently, there is much ongoing work on ways to
reduce the amount of data that needs to be processed. Recently, Rothwell et al. [1] proposed using
the Discrete Wavelet Transform to reduce the data storage requirement for target discrimination. In
this paper, we investigate various other time-frequency transforms for their potential use in the data
reduction process.


Data  Acquisition 

The data to be processed consist of the measured RCS from two model aircraft: an F-14
and an A-10. The measurements were taken in a direct illumination tapered anechoic chamber at the
US Air Force Academy. Measurements were made with a vertically polarized incident wave in a
stepped frequency CW mode. Each target was measured at a single elevation in a 360° azimuth scan
at 0.5° increments (giving 721 observation angles). For each 0.5° increment, 801 frequency samples
were taken, equally spaced over a 6-18 GHz band. Each target measurement produced an 801 X 721
(577,521 element) matrix of complex RCS values. This huge data set was then transformed to the
time domain via a Chirp-Z IFFT and time gated between ± 2.5 nsecs. Gating was used to better
isolate the targets from their surroundings. It also reduced the data set to a 512 X 721 matrix of
complex values for each target. The gating window was dimensioned so that it had negligible effect
on the data.
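
The data-set sizes quoted above follow from simple arithmetic, sketched below. The variable and function names are hypothetical. The frequency step and the unambiguous time window (the reciprocal of the frequency step, a standard property of stepped-frequency sampling) are derived here for illustration and are not stated in the paper.

```python
def angle_count(span_deg, step_deg):
    """Number of observation angles when both ends of the scan are included."""
    return int(round(span_deg / step_deg)) + 1


n_angles = angle_count(360.0, 0.5)
n_freqs = 801
raw_elements = n_freqs * n_angles          # size of the frequency-domain matrix
gated_elements = 512 * n_angles            # size after time gating

freq_step_hz = (18e9 - 6e9) / (n_freqs - 1)
unambiguous_window_s = 1.0 / freq_step_hz  # derived, not quoted in the paper

print(n_angles, raw_elements, gated_elements)
print(freq_step_hz, unambiguous_window_s)
```

Under these assumptions the 5 ns gate is a small part of the unambiguous window of roughly 67 ns.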
Transform Applications


The time domain measurements were analyzed using time-frequency transform tools in order
to find characteristic features and to reduce the data size. It is important to notice that in an
identification problem, the primary goal is not to reconstruct the original signal using a smaller set of
data, but to find proper parameters for each target in order to differentiate them. Most of the
information that is kept for reconstruction is irrelevant for identification, since the problem to be
solved is a classification problem, which is less demanding than a matching problem. Consequently,
we can discard information during the identification process, but cannot do so during signal
reconstruction.

Furthermore  the  larger  the  data  set,  the  more  susceptible  it  is  to  environmental  disturbances. 
The  ability  of  each  element  of  the  data  set  to  discriminate  between  targets  decreases  as  the  size  of  the 
data  set  increases,  since  then  the  identification  process  is  spread  over  a  larger  number  of  parameters. 
Thus  each  parameter  makes  a  small  contribution  towards  target  identification.  This  is  in  contrast  to 
smaller  data  sets  in  which  the  parameters  are  more  robust  and  not  as  sensitive  to  environmental 
disturbances  such  as  noise  and  variations  in  the  observation  angle. 

From  this  point  of  view,  the  data  analysis  can  be  done  using  various  transforms,  in  order  to 
find  a  few  relevant  coefficients  which  can  be  combined  in  the  decision  process.  The  observation  angle 
is  unknown  when  the  identification  has  to  be  made,  but  the  radar  which  is  used  provides  information 
(speed  vector,  acceleration,  speed  vector  rotation,  etc.)  that  allows  us  to  define  an  angle  range  in 
which  it  is  located.  It  is  then  important  to  find  features  that  are  maintained  for  several  consecutive 
observation  angles.  In  other  words,  clustered  coefficients  must  be  found. 

In  the  following  part  of  this  paper,  several  transforms  are  illustrated.  Considering  the  data 
size,  we  can  classify  the  transforms  into  two  families.  The  first  one  includes  the  transforms  that  do 
not increase the data size: for each 512 X 1 input vector, the output is a 512 X 1 coefficient matrix.
The FFT and the Wavelet Transforms belong to this family. With our data, they yield a 512 X 721
output matrix. The second family increases the amount of data: with a 512 X 1 input vector, the
output is a 512 X N matrix, with N greater than 1. STFT, Wigner-Ville and Choi-Williams
transforms belong to this family, and were applied to selected vectors (512 X 1) that represent the
time domain RCS at particular observation angles. All three transforms used a 32-point sliding
window. The Wigner-Ville and the Choi-Williams yielded a 512 X 32 matrix of coefficients, while
the STFT yielded a 512 X 512 matrix of coefficients for each 512 X 1 input vector.

The  Wavelet  Transforms  were  evaluated  using  the  Rice  University  Toolbox  [2].  For  an  input 
vector f, the result is stored in a vector W as:

$$W = [H^{n} f;\ G H^{n-1} f;\ G H^{n-2} f;\ \ldots;\ G H f;\ G f]$$

where n is the number of decomposition levels, H is a vector containing the low pass filter
coefficients which determine the dilation equation, and G is a vector containing the band pass
quadrature mirror filter coefficients constructed from H, which establishes the wavelet equation
$\psi(x)$ in terms of the scale function $\varphi(x)$. The dilation equation is given by:

$$\varphi(x) = \sqrt{2} \sum_{l} H(l)\, \varphi(2x - l)$$

and

$$\psi(x) = \sqrt{2} \sum_{l} G(l)\, \varphi(2x - l)$$
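
A minimal sketch of how a coefficient vector with the layout of W can be assembled from a low pass filter H and its quadrature mirror filter G is given below. It is not the Rice University Toolbox routine: the 4-tap Daubechies filter, the periodic boundary handling, and the names `analysis_step` and `wavelet_transform` are assumptions chosen for illustration.

```python
import math

# Assumed example filter: 4-tap Daubechies low pass coefficients H.
SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
H = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Quadrature mirror filter built from H: G(l) = (-1)^l H(L - 1 - l).
G = [(-1) ** l * H[len(H) - 1 - l] for l in range(len(H))]


def analysis_step(x, filt):
    """Filter x with periodic extension and keep every second output sample."""
    n = len(x)
    return [sum(c * x[(2 * k + l) % n] for l, c in enumerate(filt))
            for k in range(n // 2)]


def wavelet_transform(f, levels):
    """Return [H^n f; G H^(n-1) f; ...; G H f; G f] as one flat list."""
    approx = list(f)
    details = []
    for _ in range(levels):
        details.append(analysis_step(approx, G))  # band pass branch at this level
        approx = analysis_step(approx, H)          # low pass branch feeds the next level
    out = approx
    for d in reversed(details):                    # coarsest detail block first
        out = out + d
    return out


signal = [math.sin(0.1 * i) + 0.5 * math.cos(0.37 * i) for i in range(512)]
W = wavelet_transform(signal, levels=3)
print(len(W))  # same length as the input: the data size is not increased
```

Because the filter pair is orthonormal, the output has the same length and the same energy as the input, which is the property that places the wavelet transforms in the first family described above.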


Results 

In the discussions that follow, we use the word order to describe coefficients. By "order" we
mean the rank of the coefficient, i.e., we take the coefficients and rank them in descending order of
magnitude, so the 1st order coefficient means the largest coefficient.
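
The ranking can be sketched as follows. This is an illustration only: it assumes that each azimuth-frequency plot marks, for every observation angle, the position of the k-th order coefficient in the transformed vector, and the function names are hypothetical.

```python
def kth_order_index(coeffs, k):
    """Index of the k-th largest-magnitude coefficient (k = 1 is the largest)."""
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return ranked[k - 1]


def order_map(coeff_matrix, k):
    """For each observation angle (one row per angle), locate the k-th order coefficient."""
    return [kth_order_index(row, k) for row in coeff_matrix]


# Two observation angles, three coefficients each (made-up numbers).
example = [[1.0, 5.0, 2.0],
           [4.0, 0.0, -6.0]]
print(order_map(example, 1))
print(order_map(example, 2))
```

A cluster then corresponds to a run of consecutive angles for which `order_map` returns nearly the same index.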

Figure 1 shows the azimuth-frequency plane plots for the magnitude of the 1st order FFT
coefficients. When the entire azimuth-frequency plane plots of all the coefficients for the FFT were
examined simultaneously, no real clustering occurred. The FFT coefficients representing the F-14 did
not form a cluster which was distinct from that formed by the coefficients representing the A-10.
Instead, the FFT coefficients were scattered seemingly at random over the azimuth-frequency plane for both
targets and so were not very useful for a target identification application. Various other order FFT
coefficients were then examined for clustering. For example, we looked at the 2nd, 3rd, and 312th
order FFT coefficients. These provided no useful clustering for either the magnitude or the phase of the
FFT coefficients.

The Daubechies-6 coefficients were evaluated next. The Daubechies-6 transform was applied
separately to the amplitude and the phase of the time domain RCS. The amplitude of the RCS
yielded coefficients of some specific orders which clustered in a useful way, as shown in Figures 2
through 4, which present the clustering for the amplitudes of the 2nd, 3rd,
and 4th order Daubechies-6 coefficients. Since the amplitudes of the above coefficients cluster so
distinctly for various angle indices (which correspond to certain observation angles to the aircraft),
these coefficients can be used to distinguish between the F-14 and the A-10.

As  shown  by  Figure  2(a),  the  F-14  has  a  large  number  of  coefficients  clustered  around  the 
frequency  index  2  for  the  angle  index  ranging  between  600  and  720  (120  to  180  degrees).  Figure 
2(b)  on  the  other  hand,  shows  that  the  A-10  has  a  large  number  of  coefficients  clustered  around  the 
frequency index 4 for the same angle index range.
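
The index-to-angle pairs quoted in this section (600 to 720 for 120 to 180 degrees, 225 to 325 for -68 to -18 degrees, 300 to 400 for -30 to 20 degrees) are consistent with 0.5 degree steps and with angle index 360 lying at 0 degrees. The helper below encodes that reading; the zero-index value is inferred from those pairs, not stated in the text.

```python
def angle_index_to_degrees(index, step_deg=0.5, zero_index=360):
    """Map an angle index (0..720) to azimuth in degrees, assuming index 360 is 0 degrees."""
    return (index - zero_index) * step_deg


for idx in (600, 720, 225, 325, 300, 400):
    print(idx, angle_index_to_degrees(idx))
```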

In Figure 3(a), the Daubechies-6 amplitude coefficients of 3rd order clustered around the
frequency index 4 for the F-14 for an angle index range of 325 to 425. For this same angle range and
coefficient, Figure 3(b) for the A-10 shows clustering around the frequency index 6 and no clustering
around 4.

Similar arguments can be made for the 4th order Daubechies-6 amplitude coefficients as
shown in Figures 4(a) and 4(b). In Figure 4(b), the A-10 shows good clustering around the frequency
index 3 for angle ranges between 500-550, and 650-700. The F-14 in Figure 4(a) shows clustering
around the frequency index 6 for the same angle range. So, within the appropriate
observation sector, the F-14 can easily be distinguished from the A-10 by using coefficients of order
2, 3, or 4. There were no distinct clusters for the coefficients obtained with the phase of the RCS.

Azimuth-frequency plots of the Daubechies-20 coefficients for the amplitude of the RCS are
shown in Figures 5 and 6. For the 3rd order coefficients of the F-14 as shown in Figure 5(a), there is
no clustering around the frequency index 3 for an angle range between 0 and 175. For the A-10
shown in Figure 5(b), the 3rd order coefficient clusters around the frequency index 3 for the same
angle range. For the 4th order coefficients, there is distinct clustering for the angle index range
between 225 and 325 (-68 to -18 degrees). In this angle range, the 4th order coefficients for the F-14
are clustered around the frequency index 3 while the A-10 coefficients are clustered around 9. The
Daubechies-20 coefficients for the phase of the time domain RCS did not cluster in any useful way.

Figures 7 and 8 show the azimuth-frequency plots for the Mallat coefficients of orders 5 and
6 calculated from the amplitude of the RCS. The 5th order Mallat coefficients for the F-14 are
clustered around the frequency index 7 for an angle index range between 300 and 400 (approximately
-30 to 20 degrees), while those for the A-10 are clustered around the frequency index 5. There is
also good clustering for the 6th order coefficients as shown in Figure 8. Consequently, the Mallat
coefficients of order 5 and 6 can be used to distinguish between the F-14 and the A-10. There were
no distinct clusters for the Mallat coefficients calculated from the phase of the RCS.

Using a 32-point window for the STFT, the Wigner-Ville, and the Choi-Williams transforms,
we looked at coefficients for observation angles between -10 and +10 degrees. We found no useful
clusters when using any of these three transforms. These transforms are more difficult to use because
they increase the data set. Further investigation must be done in order to explore all observation
angles, coefficient orders, and other window sizes.


Conclusion 

The study made on RCS measurements in order to discriminate two aircraft shows good
clustering for some of the transforms used. For that purpose, the Wavelet Transforms seem to be the
most promising, but further investigation has to be made on the STFT, Wigner-Ville and Choi-Williams
transforms. In addition, the clustered coefficients extend in approximately 50 degree
sectors. Therefore the segmentation of the observation angle is compatible with the ambiguity of the
target orientation relative to the radar.

FIGURE 1: 1st order amplitude coefficient for the FFT
FIGURE 2: 2nd order amplitude coefficient for the Daubechies-6 transform
FIGURE 3: 3rd order amplitude coefficient for the Daubechies-6 transform
FIGURE 4: 4th order amplitude coefficient for the Daubechies-6 transform
FIGURE 7: 5th order amplitude coefficient for the Mallat transform
FIGURE 8: 6th order amplitude coefficient for the Mallat transform
(Plot axes: frequency index versus angle index.)


1. Introduction

The  method  of  moments  formulation  has  proven  a  powerful  tool  in  the  antenna  modeling  community.  The 
proliferation of low-end PC platforms has brought this modeling capability to the average user. With this
continuing availability, however, comes the need for faster, less computationally intensive calculations.
With  only  subtle  losses  in  precision,  the  matrix  fill  times  for  many  small  problems  can  be  drastically 
decreased. 

Several widely used codes which are available to the antenna modeling community, such as NEC and
GEMACS, are based on a Method of Moments (MoM) formulation which combines three-term
trigonometric basis functions with point matching. The particular form used in this technique to represent
the current on segment n is [1]

$$I_n(z) = A_n + B_n \sin\beta(z - z_n) + C_n \cos\beta(z - z_n) \qquad (1)$$


such that $|z - z_n| < \Delta/2$, where $z_n$ denotes the midpoint coordinate of the segment and $\Delta$ is the segment length.
When  computing  the  elements  of  the  impedance  matrix,  certain  integrals  must  be  evaluated  which  are 
related  to  the  constant  term,  the  sine  term  and  the  cosine  term.  The  integrals  associated  with  the  sine  and 
cosine  terms  have  closed  form  solutions.  On  the  other  hand,  because  of  the  presence  of  the  constant  term, 
it  is  necessary  to  evaluate  generalized  exponential  integrals  of  the  form 

$$E_0(\rho, z) = \int_{-\Delta/2}^{\Delta/2} \frac{e^{-j\beta R}}{R}\, dz' \qquad (2)$$

where

$$R = \sqrt{(z - z')^2 + \rho^2} \qquad (3)$$


Several methods have been used in the past to compute this integral including numerical integration and
some series representations. However, a rigorous evaluation of the generalized exponential integral, the
only non-analytic term in the MoM formulation, takes up a significant portion of the impedance matrix fill
time. Fortunately, many of the impedance elements have only a small effect on the moment method
solution and, therefore, a high degree of accuracy is not required in their evaluation. This paper presents
approximations to the generalized exponential integral which can greatly reduce the computational time
while maintaining a high degree of accuracy in the MoM solution. It is shown that the self-impedance terms
as well as the adjacent terms, i.e., the terms separated by one segment length, can be accurately and
efficiently computed using the first two terms of a recently found thin-wire asymptotic expansion for the
generalized exponential integral [2]. All other matrix elements are computed using the far field
approximation for the generalized exponential integral. Comparisons of input impedances for various
thin-wire antenna configurations are made with a MoM formulation in which a more robust evaluation of the
generalized exponential integral is made. Also presented are the corresponding matrix fill times for the
approximate and robust MoM codes. In addition to this, comparisons are made of accuracy and efficiency
between these codes and the NEC3D code.


2.  Impedance  Matrix  Element  Approximations 

Numerical  integration  schemes  may  be  employed  to  evaluate  (2).  However,  this  integral  is  difficult  to 
evaluate  directly  using  numerical  techniques  because  its  integrand  is  sharply  peaked.  One  common 
procedure  for  avoiding  this  problem  is  to  extract  the  "singularity"  from  this  integral,  which  leads  to  [2] 

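$$E_0(\rho, z) = F_{-1} + \int_{-\Delta/2}^{\Delta/2} \frac{e^{-j\beta R} - 1}{R}\, dz' \qquad (4)$$

where

$$F_{-1} = \begin{cases} \ln\left[\dfrac{\zeta_2 + R_2}{\zeta_1 + R_1}\right], & \zeta_1 > 0 \\[2ex] \ln\left[\dfrac{R_1 - \zeta_1}{R_2 - \zeta_2}\right], & \zeta_1 \le 0 \end{cases} \qquad (5)$$

and

$$R_1 = \sqrt{\zeta_1^2 + \rho^2} \qquad (6)$$

$$R_2 = \sqrt{\zeta_2^2 + \rho^2} \qquad (7)$$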
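
The effect of extracting the 1/R term can be illustrated with a short sketch. It assumes the substitution $\zeta = z' - z$, so that the limits become $\zeta_1 = -\Delta/2 - z$ and $\zeta_2 = \Delta/2 - z$, and it uses the inverse hyperbolic sine as an equivalent closed form for $F_{-1}$. The function names and the example dimensions are illustrative choices, and the Simpson rule serves only as a brute-force reference.

```python
import cmath
import math


def f_minus1(z1, z2, rho):
    """Closed-form integral of 1/R over [z1, z2], with R = sqrt(zeta^2 + rho^2)."""
    return math.asinh(z2 / rho) - math.asinh(z1 / rho)


def smooth_part(zeta, beta, rho):
    """Integrand that remains after the 1/R term has been extracted."""
    r = math.hypot(zeta, rho)
    return (cmath.exp(-1j * beta * r) - 1.0) / r


def gauss3(func, a, b):
    """Three-point Gauss-Legendre quadrature on [a, b]."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    node = math.sqrt(3.0 / 5.0)
    return half * (5.0 / 9.0 * func(mid - half * node)
                   + 8.0 / 9.0 * func(mid)
                   + 5.0 / 9.0 * func(mid + half * node))


def simpson(func, a, b, panels=4000):
    """Composite Simpson rule, used here only as a reference."""
    h = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def integral_extracted(beta, rho, z, delta, rule=gauss3):
    """Evaluate the integral as F_(-1) plus a quadrature of the smooth remainder."""
    z1, z2 = -delta / 2 - z, delta / 2 - z
    return f_minus1(z1, z2, rho) + rule(lambda s: smooth_part(s, beta, rho), z1, z2)


beta = 2 * math.pi                      # wavelength normalized to 1
rho, delta = 0.001, 0.05                # wire radius and segment length in wavelengths
fast = integral_extracted(beta, rho, delta, delta)                 # adjacent segment
reference = integral_extracted(beta, rho, delta, delta, rule=simpson)
print(abs(fast - reference))
```

For the adjacent-segment case shown, three quadrature points on the smooth remainder already agree closely with the dense reference, whereas the raw integrand of (2) would need many more samples near its peak.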


Evaluation of the generalized exponential integral (2) is often accomplished numerically using (4) and can
take a significant portion of the total computational time required for a matrix fill operation. Exact series
expansions offer an alternative approach for the efficient as well as accurate evaluation of (2). One form
of an exact expression for (2) may be derived by using a Maclaurin series expansion of the complex
exponential function contained in the integrand and integrating term by term [2]. The result is

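$$E_0(\rho, z) = e^{-j\beta R_0} \sum_{n=0}^{\infty} \frac{(-j\beta)^n}{n!} \sum_{k=0}^{n} \binom{n}{k} (-R_0)^{n-k} F_{k-1} \qquad (8)$$

where

$$F_m = \int_{\zeta_1}^{\zeta_2} R^m\, d\zeta \qquad (9)$$

in which $\zeta_1 = -\Delta/2 - z$, $\zeta_2 = \Delta/2 - z$, and $R_0 = \sqrt{z^2 + \rho^2}$. Closed form solutions can be obtained for the integrals
$F_{-1}$ and $F_0$ while a recurrence relation can be used to determine the higher order integrals $F_m$ for $m \ge 1$. The
solution to $F_{-1}$ is given in (5), while $F_0 = \Delta$ and

$$F_m = \frac{1}{m + 1}\left[\zeta_2 R_2^m - \zeta_1 R_1^m + m \rho^2 F_{m-2}\right], \quad m \ge 1 \qquad (10)$$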
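
The recurrence and the term-by-term series can be checked numerically with the sketch below. It reads the expansion as a Maclaurin series of $e^{-j\beta(R - R_0)}$ about $R_0 = \sqrt{z^2 + \rho^2}$, with the powers $(R - R_0)^n$ expanded binomially; that reading, the function names, and the example dimensions are assumptions made for illustration.

```python
import cmath
import math


def f_m(m, z1, z2, rho):
    """F_m = integral of R**m over [z1, z2], R = sqrt(zeta^2 + rho^2), for m >= -1."""
    if m == -1:
        return math.asinh(z2 / rho) - math.asinh(z1 / rho)
    if m == 0:
        return z2 - z1
    r1, r2 = math.hypot(z1, rho), math.hypot(z2, rho)
    return (z2 * r2 ** m - z1 * r1 ** m + m * rho ** 2 * f_m(m - 2, z1, z2, rho)) / (m + 1)


def integral_series(beta, rho, z, delta, terms=20):
    """Sum the first `terms` orders of the series about R0."""
    z1, z2 = -delta / 2 - z, delta / 2 - z
    r0 = math.hypot(z, rho)
    f = [f_m(k - 1, z1, z2, rho) for k in range(terms)]  # f[k] holds F_(k-1)
    total = 0.0
    for n in range(terms):
        inner = sum(math.comb(n, k) * (-r0) ** (n - k) * f[k] for k in range(n + 1))
        total += (-1j * beta) ** n / math.factorial(n) * inner
    return cmath.exp(-1j * beta * r0) * total


def integral_reference(beta, rho, z, delta, panels=4000):
    """Reference value: closed-form 1/R part plus Simpson's rule on the remainder."""
    z1, z2 = -delta / 2 - z, delta / 2 - z

    def g(s):
        r = math.hypot(s, rho)
        return (cmath.exp(-1j * beta * r) - 1.0) / r

    h = (z2 - z1) / panels
    total = g(z1) + g(z2)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(z1 + i * h)
    return f_m(-1, z1, z2, rho) + total * h / 3.0


beta, rho, delta = 2 * math.pi, 0.001, 0.05   # wavelength normalized to 1
for z in (0.0, delta):
    print(z, abs(integral_series(beta, rho, z, delta) - integral_reference(beta, rho, z, delta)))
```

Calling `integral_series(..., terms=2)` keeps only the n = 0 and n = 1 orders, which is the kind of two-term truncation used for closely spaced wires later in the paper.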

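Another exact representation of (2) was recently derived in [2]. This expansion is simplified in [3] and
takes the form

$$E_0(\rho, z) = J_0(\beta\rho) \ln\left(\frac{\gamma_1}{\gamma_2}\right) + \sum_{m=1}^{\infty} \frac{(-j)^m}{m} J_m(\beta\rho) \left[\gamma_1^{m} - \gamma_1^{-m} - \gamma_2^{m} + \gamma_2^{-m}\right], \quad |z| \ge \Delta/2 \qquad (11)$$

$$E_0(\rho, z) = J_0(\beta\rho) \ln(\gamma_1 \gamma_2) + \sum_{m=1}^{\infty} \frac{(-j)^m}{m} J_m(\beta\rho) \left[\gamma_1^{m} - \gamma_1^{-m} + \gamma_2^{m} - \gamma_2^{-m}\right], \quad |z| \le \Delta/2 \qquad (12)$$

where

$$\gamma_1 = \mu_1 + \sqrt{\mu_1^2 - 1} \qquad (13)$$

$$\gamma_2 = \mu_2 + \sqrt{\mu_2^2 - 1} \qquad (14)$$

$$\mu_1 = \frac{\sqrt{(|z| + \Delta/2)^2 + \rho^2}}{\rho} \qquad (15)$$

$$\mu_2 = \frac{\sqrt{(|z| - \Delta/2)^2 + \rho^2}}{\rho} \qquad (16)$$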

For thin wires, a small argument approximation for Bessel functions can be used to find asymptotic
representations of (11) and (12) [4]. The resulting asymptotic expansions are

Eo(p,z)  ~  Sq  +  52  -lil-  c  as  pp  -*  0  and  z  =  0 
m  =  l 

(17) 

and 

E°(p,z)  "  .  52  pp  -*  0  and  |z|  ^  A/2 

m  =  l 

(18) 

The  quantities  found  in  (17)  and  (18)  are  given  by 

So  =  2  {n  (o) 

(19) 

bo  =  fn  (yjy.) 

(20) 

•-(H 

(21) 

p 


A MoM code was developed using the trigonometric basis functions of (1) with extrapolated continuity
and point matching. The code results were compared using several methods for evaluating the generalized
exponential integral (2) required for the calculation of impedance matrix elements. The first option in the
code, which will be called the robust option, evaluates the integral using (8) for the self-impedance cases. All other
cases use a three point Gaussian quadrature numerical procedure to compute the integral as expressed in (4). This
technique results in extremely accurate values for the generalized exponential integral and the
corresponding impedance matrix elements.

A second option in the code, the approximate option, evaluates the integral using (17) and (18). In particular, for
self-impedance cases (z = 0), the approximate option computes the first two terms in (17) and for the
adjacent mutual impedances (|z| = Δ), the approximate option computes the first two terms in (18) such
that

$$E_0 \approx 2\ln(\sigma) - j\beta a \sigma, \quad z = 0 \text{ and } \rho = a \qquad (25)$$


$$E_0 \approx \ln\left(\frac{\gamma_1}{\gamma_2}\right) - j\frac{\beta a}{2}(\gamma_1 - \gamma_2), \quad |z| = \Delta \text{ and } \rho = a \qquad (26)$$

All other mutual impedance terms are computed by using the far field approximation to evaluate the integral [5].
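
A rough numerical feel for the two-term thin-wire forms can be obtained with the sketch below. It reads (25) as $2\ln\sigma - j\beta a\sigma$ and (26) as $\ln(\gamma_1/\gamma_2) - j(\beta a/2)(\gamma_1 - \gamma_2)$, and it assumes that $\sigma$ equals $\gamma$ evaluated at z = 0; these readings, the helper names, and the example dimensions are assumptions, since the defining equations are only partly legible in the source.

```python
import cmath
import math


def gamma(u, rho):
    """(sqrt(u^2 + rho^2) + |u|) / rho, i.e. mu + sqrt(mu^2 - 1) with mu = sqrt(u^2 + rho^2) / rho."""
    return (math.hypot(u, rho) + abs(u)) / rho


def approx_self(beta, a, delta):
    """Two-term thin-wire form for z = 0 and rho = a."""
    sigma = gamma(delta / 2, a)          # assumed definition of sigma
    return 2 * math.log(sigma) - 1j * beta * a * sigma


def approx_adjacent(beta, a, delta):
    """Two-term thin-wire form for |z| = delta and rho = a."""
    g1 = gamma(1.5 * delta, a)           # |z| + delta/2
    g2 = gamma(0.5 * delta, a)           # |z| - delta/2
    return math.log(g1 / g2) - 1j * beta * a / 2 * (g1 - g2)


def reference(beta, rho, z, delta, panels=4000):
    """Brute-force value of (2): closed-form 1/R part plus Simpson's rule on the remainder."""
    z1, z2 = -delta / 2 - z, delta / 2 - z

    def g(s):
        r = math.hypot(s, rho)
        return (cmath.exp(-1j * beta * r) - 1.0) / r

    h = (z2 - z1) / panels
    total = g(z1) + g(z2)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(z1 + i * h)
    return math.asinh(z2 / rho) - math.asinh(z1 / rho) + total * h / 3.0


beta = 2 * math.pi                       # wavelength normalized to 1
a, delta = 0.001, 0.02                   # radius and segment length in wavelengths
for label, value, z in (("self", approx_self(beta, a, delta), 0.0),
                        ("adjacent", approx_adjacent(beta, a, delta), delta)):
    exact = reference(beta, a, z, delta)
    print(label, abs(value - exact) / abs(exact))
```

For this example geometry the relative differences stay small for both terms, in line with the observation that high accuracy in individual matrix elements is often unnecessary.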


Both  thin-wire  options  are  valid  for  computing  the  impedance  matrix  in  the  MoM  formulation  for  wires 
with  radii  adxlO'^A  and  a  segment  length-to-radius  ratio  A/a::8.  The  approximate  thin-wire  option  offers 
increased efficiency in filling the impedance matrix because only analytic expressions are used in the
formulation, while the robust thin-wire option uses a more rigorous and, consequently, more accurate
calculation of the integral.

Table 1 shows a comparison of the input impedance of a half-wave dipole for various radii computed using
the two options available in the code. The dipole was divided into 21 segments. Relative percent errors
for the input impedance, input resistance, and input reactance were computed using the results
from the robust and approximate options such that

$$\%\ \text{error} = \frac{\text{robust} - \text{approximate}}{\text{robust}}$$

Errors remain small for all cases, indicating that reasonable accuracy can be obtained with the approximate
thin-wire option.

Table 1. Input Impedance of a 21 Segment Half-Wave Dipole for Various Wire Radii

* Segmentation of half-wave dipole reduced to seven in order to satisfy Δ/a ratio for thin wires.


For most of the non-adjacent mutual impedance calculations, the approximate technique for evaluating
the integral discussed above works very well. However, for separate wires which lie close together, i.e., within a
segment length Δ, a more accurate form is used. For these cases, the MoM code uses the first two terms
of the Maclaurin series expansion given in (8). That is,

$$E_0 \approx e^{-j\beta R_0}\left[F_{-1}(1 + j\beta R_0) - j\beta\Delta\right] \qquad (29)$$
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Accuracy  of  the  robust  and  approximate  thin-wire  options  is  illustrated  in  Table  2  where  input  impedance 
results  are  shown  for  a  half-wave  dipole,  a  three  element  Yagi-Uda  and  a  simple  tee  antenna.  The  geometry 
of  the  multiple  wire  antennas  are  illustrated  in  Figure  1  where  the  wire  radii  are  a=lxlO‘^A.  in  all  cases.  The 
results  are  compared  with  NEC3D  which  is  known  to  be  a  reliable  thin-wire  antenna  modeling  code. 


Figure 1 - a) Tee Antenna; b) Yagi-Uda Antenna. Note that the figures are not drawn to scale and all
distances are in wavelengths.

The  difference  between  the  PSU  MoM  code  and  NEC3D  is  primarily  due  to  the  extrapolated  current 
continuity  method  used  by  the  PSU  code.  NEC3D  relies  on  a  more  mathematically  intensive  current  and 
charge continuity method to enforce boundary conditions between segments. However, the differences
between the two PSU code options show that using the approximate method for calculation of the
generalized exponential integral results in a negligible loss in precision.

The  advantages  of  the  approximate  method  are  revealed,  however,  in  terms  of  matrix  fill  time.  As  an 
indication  of  the  efficiency  obtained  using  the  options  within  the  PSU  MoM  code,  execution  times  in  filling 
the  impedance  matrix  for  the  three  antenna  configurations  were  measured  and  compared  to  the  fill  time 
associated  with  NEC3D.  NEC3D  uses  a  time-consuming  adaptive  Romberg  technique  to  numerically 
evaluate the integral using the form given in (4). As shown in Table 3, the approximate integral calculation has
decreased  the  matrix  fill  time  by  about  25%  from  the  robust  formulation  with  no  significant  change  in  the 
input  impedance.  It  is  expected  that  larger  and  more  complex  structures  will  show  similar  results  since  the 
approximations  are  more  accurate  when  the  source  point  and  field  point  are  separated  by  a  greater  distance. 
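
The quoted reduction can be cross-checked against the fill times listed in Table 3. The snippet below recomputes the percentage change from the robust to the approximate option; the row label for the first configuration and the pairing of the two PSU columns as (robust, approximate) are inferred from the order used elsewhere in the paper.

```python
# Matrix fill times in seconds from Table 3: (robust option, approximate option).
FILL_TIMES = {
    "Half Wave Dipole": (1.05, 0.77),
    "Yagi-Uda (3 element)": (1.59, 1.15),
    "Tee Antenna": (3.46, 2.64),
}


def percent_reduction(robust, approximate):
    """Percentage by which the approximate option shortens the robust fill time."""
    return 100.0 * (robust - approximate) / robust


for name, (robust, approximate) in FILL_TIMES.items():
    print(f"{name}: {percent_reduction(robust, approximate):.1f}% shorter fill time")
```

The recomputed values agree with the tabulated percentages to within the rounding of the listed times.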

In the method of moments, a critical analysis of where computational time is spent can be educational.
Isolating problem areas and potential bottlenecks is essential to increasing computational speed. Here,
we see that certain MoM codes may potentially spend a majority of time calculating the generalized
exponential integral to a needless precision. As expected, only the self and nearby adjacent terms contribute
significantly to the impedance matrix, thereby rendering many highly precise calculations unnecessary.

Table 2. Input Impedance Results

                      NEC3D              PSU MoM (robust)   PSU MoM (approx.)   % Error (robust vs. approx.)
Half Wave Dipole      77.775 + j44.501   76.830 + j43.581   76.837 + j43.601    1.860 X 10^-2
Yagi-Uda (3 element)  162.44 + j569.52   164.93 + j572.17   164.91 + j571.63    8.663 X 10^-2
Tee Antenna           311.68 - j2848.6   301.43 - j2807.7   301.67 - j2805.0

Table 3. Matrix Fill Times

                      NEC3D       PSU MoM (robust)   PSU MoM (approx.)   % Reduction (robust vs. approx.)
Half Wave Dipole      3.84 sec    1.05 sec           0.77 sec            26.6
Yagi-Uda (3 element)  5.60 sec    1.59 sec           1.15 sec            27.7
Tee Antenna           12.25 sec   3.46 sec           2.64 sec            23.7


Introduction

This paper describes the role of a commercial antenna modeling program in the development
of a patent-pending method for the design and construction of multiple-frequency
antennas. Without a reliable computer model, characterization of the principle that
underlies this design method would have required extensive experimentation which, in
this case, would have been impractical to conduct. Modeling not only made it possible to
obtain data from which design equations could be derived, but also revealed various nuances and
subtleties of the principle which would likely have remained undiscovered
from purely empirical data.

The  Coupled-Resonator  Principle 

It is well known that conductors in proximity to one another exhibit strong mutual
coupling. A design technique called the Coupled-Resonator (C-R) principle [1] has been
developed which uses this coupling to advantage. The C-R principle defines the conditions
for optimum coupling, creating a system with multiple resonant frequencies, driven
at a single feedpoint. Such a multiple-resonant structure consists of a driven dipole
or monopole at the lowest frequency of operation, with additional resonant conductors
surrounding it, placed at the appropriate distances.

Figure 1 demonstrates the C-R principle in its simplest form, a two-frequency system.
A half-wavelength driven dipole is resonant at some frequency, F1, and driven at the
center. A typical return loss sweep for such a dipole is depicted in Figure 1(a). In Figure
1(b), an additional conductor, half-wavelength resonant at an arbitrarily chosen higher
frequency, F2, is placed nearby. Coupling between this conductor and the driven dipole
creates a return loss sweep, observed at the dipole feedpoint, which shows a "bump" at
the resonant frequency of the second conductor.

The main premise of the Coupled-Resonator principle is that there is an optimum
spacing between conductors where the coupling results in a matched condition at F2, as
sketched in Figure 1(c). The effect at F1 is minimal, and the system is matched at both
frequencies.

The  above  description  also  applies  to  systems  where  the  driven  element  is  a  monopole 
fed  against  ground,  given  the  equivalence  of  a  monopole  and  dipole.  In  this  case,  the 
feedpoint  impedance  of  a  monopole  will  be  one-half  that  of  an  equivalent  dipole. 

This two-frequency example can be expanded to three, four, five or more frequencies
by adding resonators and placing them radially around the fed dipole or
monopole, as shown in Figure 2. A practical upper limit on the number of frequencies
this structure will support is reached when the complexity of multiple interactions
obscures the desired coupling. Systems up to seven frequencies have been successfully
modeled. Five-frequency systems have been constructed and readily tuned and matched
at the desired resonant frequencies.

Design  Equations 

The variables involved in the design of antennas using the C-R principle are: conductor
diameter, conductor spacing, feedpoint impedance, and the ratio of frequencies.
These are all defined from the point of reference of the additional frequency under
consideration.

Conductor spacing follows this general relationship:

$$\frac{\log(d)}{\log(D/4)} = 0.54$$

where d is the distance between conductors and D is the diameter of the conductors,
both expressed in wavelengths at Fn. This approximation is normalized for a feedpoint
impedance at Fn equal to that of the dipole in free space (72 ohms) or a monopole over
perfect ground (36 ohms), and for an Fn/F1 ratio of 1.3 or greater.

The equation can be modified to allow for a wider range of impedances and lower Fn/F1
ratios. Using a straight-line approximation for impedance and a first-order curve-fit
for frequency ratio correction, the original equation then becomes:

$$d = 10^{[0.54 \log(D/4)]} \times \frac{Z_0 + 35.5}{109} \times \left[1 + e^{-[(((F_n/F_1) - 1.1) \times 11.3) + 0.1]}\right]$$

where,

Z0 is the desired feedpoint impedance at Fn, within the range of 25 to 125 ohms
F1 is the resonant frequency of the driven dipole
Fn is the resonant frequency of the additional resonator
Fn/F1 frequency ratio is greater than 1.1:1
d is in the range of 0.01 to 0.00001 wavelength
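
A small helper makes the spacing relationship easier to explore. It is a sketch under stated assumptions: the logarithm is taken as base 10, the function name `cr_spacing` and its default arguments are hypothetical, and the validity ranges listed above are not enforced.

```python
import math


def cr_spacing(D, z0=72.0, ratio=1.3):
    """Resonator spacing d in wavelengths at Fn.

    D     conductor diameter in wavelengths at Fn
    z0    desired feedpoint impedance at Fn, in ohms
    ratio frequency ratio Fn/F1
    """
    base = 10 ** (0.54 * math.log10(D / 4))                 # general relationship
    impedance_term = (z0 + 35.5) / 109                      # straight-line impedance correction
    ratio_term = 1 + math.exp(-(((ratio - 1.1) * 11.3) + 0.1))  # frequency ratio correction
    return base * impedance_term * ratio_term


for ratio in (1.15, 1.3, 2.0):
    print(ratio, cr_spacing(0.001, z0=50.0, ratio=ratio))
```

For large frequency ratios the last factor tends to 1, so the expression falls back to the general relationship scaled by the impedance term.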

A significant characteristic is independent control of impedance at each frequency,
F2, F3 ... Fn. Adjustment of the spacing, combined with the reactance change as antenna
length is altered, allows a wide range of adjustment.

There are two additional characteristics that can be explained by the simplified equivalent
circuit shown in Figure 3. At Fn, the feedpoint impedance is the combination of
the impedance of the driven dipole or monopole and the coupled-resonator impedance,
plus the total effect of any other resonators in the system (predominantly
capacitance). Compensation for this capacitance is readily achieved by simply lengthening the
resonator (typically 0.25 to 0.5 percent) to add inductance.


Figure 2. Pictorial of a five-frequency C-R
antenna in monopole configuration.


Figure 3. Simplified equivalent circuit of a
C-R antenna system at Fn.

Another effect is an apparent anomaly that occurs when the ratio of Fn/F1 is approximately
3, where a significant increase in the spacing is required to achieve the desired
impedance. This is readily explained by noting that the driven dipole has a relatively
low impedance at 3/2 wavelength (3/4 wavelength for a monopole). In this case, the
coupled-resonator impedance must be higher than normal to achieve the desired parallel
combination of the two impedances, which corresponds to a greater spacing distance.

Radiation  Characteristics 

Antennas designed according to the C-R principle are accurately modeled using
method-of-moments analysis, including software based on either the Numerical
Electromagnetics Code (NEC) or MININEC3. The program used for development of this
antenna technique was ELNEC [2]. In most configurations, the directivity (gain) is very
close to that of a simple dipole at all frequencies, suggesting that radiation is primarily
from the resonant conductor. Some frequencies exhibit a slight gain over a dipole,
suggesting that in-phase current is present in the portion of the driven dipole which extends
beyond the active region. Analysis of the currents verifies these conclusions. A later
section details the role of modeling in determining antenna behavior.

Advantages  and  Limitations 

The  principal  advantage  of  this  antenna  design  is  the  absence  of  reactive  components, 
such  as  tuned  circuits  or  capacitively-loaded  coaxial  stubs,  which  are  often  used  to 
achieve  multi-frequency  operation.  These  components  may  introduce  losses,  or  require 
time-consuming  tuning  adjustment.  The  C-R  antenna  design  achieves  its  performance 
by  controlling  the  physical  dimensions  of  conductor  length,  diameter  and  spacing. 

Another significant advantage is that the feedpoint impedance at each additional
frequency can be controlled by adjustment of resonator spacing and length. For example,
when a C-R antenna element is placed in an array, the driving point impedances can be
significantly different at each operating frequency. The C-R principle allows each
frequency's resonator to be adjusted over a useful range of resistance and reactance.

Two limitations should be noted. First, the tradeoff for electrical simplicity is a relatively
complex mechanical assembly. The structure must support a central dipole or
monopole and maintain spacing with the additional resonators with insulators or other
means. However, it should be noted that other multi-frequency configurations also have
special construction requirements. The other limitation of the C-R method is a reduction
in VSWR bandwidth at F2, F3 and higher frequencies of operation, compared to a simple
dipole or monopole. This shortcoming can be mitigated by the use of large-diameter
conductors, or in extreme cases, additional resonators with overlapping coverage. Again,
other common multi-frequency antenna designs also exhibit reduced bandwidth.

The  Role  of  Computer  Modeling  in  Development 

While investigating various configurations of multiband antennas for possible amateur
radio use, the open-sleeve antenna was evaluated. The open-sleeve is a derivative
of the coaxial dipole, in which a simple dipole or monopole is enclosed by a coaxial sleeve
that is approximately one-half the length of the driven element, and resonant at about
twice the frequency. The open sleeve reduces the configuration to two conductors that
represent a “skeleton” of the original coaxial sleeve. This antenna is well known, having
been developed in the 1940s and included in several major reference texts [3].

Descriptions of these antennas referred to the coaxial or open-sleeve section as a
transmission line transformer. An intuitive conclusion was made that the current
distribution and radiation patterns calculated by ELNEC were not consistent with a
transformer model. Once this conclusion was reached, it was a logical step to assume that
two conductors were unnecessary, since the simulation of a coaxial line was not
required. Hence the concept of a single additional conductor for each new frequency of
coverage (see the sequence illustrated in Figure 4).

Without the restrictions of a simulated coaxial line, the structure of an antenna using a
coupled resonator element becomes much simpler. The first expansion of the concept
was to evaluate an antenna with more than one additional resonator. A three-frequency
design was modeled successfully, with current distribution and radiation pattern
analyzed at each frequency. This model confirmed that each additional resonator operated
independently: maximum current occurred in whichever conductor was resonant, and
the radiation pattern was dipole-like at each frequency. More complex models of four,
five, six and seven frequency antennas further reinforced the original concept, while
uncovering new performance characteristics.

Construction of Trial Antennas

Before continuing, validation of the accuracy of the model was deemed highly
desirable. At this point, all work on the C-R concept had been performed on the computer. It
was time to build some antennas.

The first antenna was constructed one step at a time. A dipole for the 14 MHz amateur
band was built from aluminum tubing, tapering from 1-1/4” to 5/8” diameter. An
average uniform diameter of approximately 1” was assumed for the computer model.
Next, a conductor resonant in the 21 MHz band was placed in parallel with the 14 MHz
dipole at a distance of 7” o.c. Plastic spacers were used to maintain uniform spacing. As
predicted, the 21 MHz resonance was seen at the feedpoint, and the resonance
(non-reactive feedpoint impedance) of the 14 MHz dipole moved slightly higher in frequency. The
capacitance introduced by the new conductor explains the frequency shift. Finally, a
third conductor resonant in the 28 MHz band was added in the same manner, placed on
the opposite side of the 14 MHz fed dipole for maximum isolation from the 21 MHz
resonator. The new resonance appeared at the feedpoint as expected, with another small
shift in the resonant frequency at 14 MHz.

An additional characteristic was evaluated at this time: VSWR bandwidth. The
model had predicted that this system would have a narrower bandwidth between 2:1
VSWR points at both 21 and 28 MHz, compared to a simple dipole at each frequency,
and that the highest frequency would show the most pronounced narrowing. The measured
bandwidth at 28 MHz was 400 kHz, about 10-15 percent greater than predicted, a reasonable
accuracy for a modest experiment.
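
The 2:1 VSWR bandwidth discussed above can be computed from a sweep of feedpoint impedances. The following is a minimal sketch, not the procedure used by ELNEC; the function names, the reference impedance argument `z0`, and the assumption of a single contiguous passband in the sweep are illustrative choices that do not come from the paper.

```python
import math


def vswr(z_load, z0):
    """VSWR of a complex load impedance z_load on a line of impedance z0."""
    # Magnitude of the voltage reflection coefficient.
    gamma = abs((z_load - z0) / (z_load + z0))
    if gamma >= 1.0:
        return math.inf  # total reflection
    return (1.0 + gamma) / (1.0 - gamma)


def vswr_bandwidth(freqs_mhz, impedances, z0, limit=2.0):
    """Span in MHz between the lowest and highest swept frequencies
    whose VSWR is at or below `limit`.

    Assumption: the sweep contains one contiguous passband, so the
    span between the outermost in-limit points is the bandwidth.
    """
    inside = [f for f, z in zip(freqs_mhz, impedances) if vswr(z, z0) <= limit]
    return max(inside) - min(inside) if inside else 0.0
```

The resolution of the result is limited by the frequency step of the sweep, so a fine step is needed near the band edges.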

On-the-air amateur radio contacts were made using this antenna, and although a
calibrated comparison antenna was not available, based on the author’s experience,
performance was within the expected range for a dipole.

An additional antenna completed the early tests to confirm the validity of the models.
A dipole for the 18 MHz and 24 MHz amateur bands was constructed from #12 copper
wire, with a series of plastic spacers used to maintain the computer-modeled 2” spacing.
Performance was obtained as predicted, with one behavior of note. The VSWR at 24
MHz varied over a wide range versus the height above ground. Although this is a
well-known phenomenon, the degree of variation was greater than expected, and greater
than observed with the earlier antenna. Evaluation of this characteristic using the
computer model confirmed the variation. Apparently, the combination of closer spacing
and tighter physical tolerances required for the small-diameter wire conductors is the
cause. The lower Q of the aluminum tubing construction minimized these effects.

Development  of  Design  Equations 

Having confirmed the validity of the ELNEC model for antennas using the
Coupled-Resonator principle, a methodical search was undertaken to establish design equations.
After some very preliminary calculations to establish the likely nature of the equation,
the following parameters were evaluated for the simplest two-frequency case:

•  Required spacing versus conductor diameter, with a fixed 2:1 frequency ratio.

•  Spacing versus frequency ratio, with fixed conductor diameter.

•  Spacing versus resistive component of impedance, with fixed conductor diameter.

•  Determination of an appropriate frame of reference.

First, models were created and iterated to tabulate the
required spacing for element diameters of 0.00001, 0.0001, 0.001 and 0.01 wavelength.
The dimensions for both diameter and spacing were referenced to the additional
frequency, and for an impedance equal to that of a dipole in free space, or 72 ohms.
The data were only collected for the case where both the driven dipole/monopole and the
additional resonator were the same diameter. When plotted, the data made a nearly
straight-line fit on a log-log scale. Weighting the data in favor of the spacing distance
resulted in a more accurate log-log curve fit, giving the basic equation:

$$\frac{\log_{10} d}{\log_{10}(D/4)} = 0.54 \qquad (1)$$

where d is the center-to-center spacing distance and D is the element diameter, both
expressed in wavelengths at Fn.

Next, data were collected to determine the deviation from the above equation for
frequency ratios other than 2:1. Diameters were held constant while the frequency (and
element length, of course) was varied from a ratio Fn/F1 of 1.1 to 5.0. The data were
collected using both 0.01 and 0.0001 wavelength diameters. Above a ratio of 1.4, the
spacing was essentially the same as required for the 2:1 ratio of (1). From a ratio of 1.4 down
to 1.1, the required spacing increased in a curve that appeared to be a classic 1/e^x
function. A first-order fit to this curve provides the following correction factor for (1):

$$1 + e^{-\left[\left((F_n/F_1) - 1.1\right) \times 11.3 + 0.1\right]} \qquad (2)$$

At this point, a special case should be noted. The required spacing deviates from a
regular function at an Fn/F1 ratio of 3. At this ratio, the impedance of the driven dipole is
relatively low, the dipole being 3/2 λ long. The impedance of the coupled-resonator element must be
higher than usual for the parallel combination to equal the desired 72 ohms. This is an
anomalous situation and is not reflected in the design equations, which are believed to
accurately describe the impedance contributed by the additional resonator.
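
The parallel-combination argument can be illustrated with a simplified two-resistor picture. This is an illustrative assumption, not the paper's model: the driven dipole and the coupled resonator are treated as purely resistive impedances R_d and R_r in parallel, and R_t is the target feedpoint resistance (72 ohms in the text).

```latex
\frac{1}{R_t} = \frac{1}{R_d} + \frac{1}{R_r}
\quad\Longrightarrow\quad
R_r = \frac{R_t\,R_d}{R_d - R_t}, \qquad R_d > R_t .
```

When R_d is much larger than R_t, the required R_r is close to R_t. As R_d falls toward R_t, the required R_r rises without bound, which is the sense in which the resonator impedance must be higher than usual when the driven element presents a relatively low impedance.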

Finally, holding the frequency ratio constant at 2, the variation in spacing required to
obtain a resistive component of impedance other than 72 ohms was determined. These
data were plotted, resulting in a shallow curve on a linear scale. Over the range of 25 to
beyond 120 ohms, the following linear expression is accurate to within 5 to 10 percent:


$$\frac{Z_0 + 35.5}{109} \qquad (3)$$


Applying  the  corrections  of  (2)  and  (3)  to  equation  (1)  gives  the  basic  design  equation 
for  a  Coupled-Resonator  element: 


$$d_n = 10^{\,0.54 \log_{10}(D/4)} \times \frac{Z_0 + 35.5}{109} \times \left[1 + e^{-\left[\left((F_n/F_1) - 1.1\right) \times 11.3 + 0.1\right]}\right] \qquad (4)$$


This equation is not presented as a rigorous description; rather, it serves as a basis for
design and optimization. In general, the spacings determined using this equation are
accurate within 5 to 10 percent, even in systems of five or more frequencies. Also, (4)
does not describe systems with unequal diameter elements. An approximation using the
mean diameter of the driven and additional elements is an adequate starting point if
the variation in diameters is modest. Optimization will then complete the task of
determining the proper spacing.
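
Equation (4) is straightforward to evaluate numerically. The sketch below assumes that (4) reads d = 10^[0.54 log10(D/4)] × (Z0 + 35.5)/109 × [1 + e^−(((Fn/F1) − 1.1) × 11.3 + 0.1)], with d and D in wavelengths at Fn. The function and parameter names are illustrative, and the result is only a starting point for optimization, in keeping with the accuracy stated above.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def cr_spacing_wavelengths(diameter_wl, z0_ohms=72.0, freq_ratio=2.0):
    """Center-to-center spacing in wavelengths at Fn, per the assumed
    reading of equation (4).

    diameter_wl: element diameter D in wavelengths at Fn
    z0_ohms:     desired resistive component of feedpoint impedance
    freq_ratio:  Fn / F1 (the fit was made for ratios of about 1.1 to 5)
    """
    if diameter_wl <= 0.0:
        raise ValueError("diameter must be positive")
    if freq_ratio <= 1.0:
        raise ValueError("Fn/F1 must be greater than 1")
    base = 10.0 ** (0.54 * math.log10(diameter_wl / 4.0))          # eq. (1)
    ratio_factor = 1.0 + math.exp(-((freq_ratio - 1.1) * 11.3 + 0.1))  # eq. (2)
    impedance_factor = (z0_ohms + 35.5) / 109.0                    # eq. (3)
    return base * impedance_factor * ratio_factor


def cr_spacing_metres(diameter_m, fn_mhz, f1_mhz, z0_ohms=72.0):
    """Same estimate with physical units; equal element diameters assumed."""
    wavelength_m = SPEED_OF_LIGHT / (fn_mhz * 1e6)
    spacing_wl = cr_spacing_wavelengths(
        diameter_m / wavelength_m, z0_ohms, fn_mhz / f1_mhz
    )
    return spacing_wl * wavelength_m
```

For unequal element diameters, the mean diameter could be passed as `diameter_m`, following the approximation suggested in the text.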

The design equation of (4) also does not include element length. In general, lengths
are similar to those required for half-wavelength dipoles of the same diameter, as noted
in all major antenna reference texts. However, the exact length involves two additional
variables: the effects of all additional conductors (capacitance), and the parallel
combination of the driven dipole impedance and the Coupled-Resonator impedance. In
practice, the variation in length is less than one percent from a normal dipole length in the
majority of configurations. The greatest variations have been found with extreme Fn/F1
ratios (e.g. greater than 5), and in systems with large numbers of additional resonators
and a relatively small maximum Fn/F1 (2.5 or less).

Additional refinement of the design equation would be desirable, and suggestions
have been made at a few engineering schools that this might be a good project for a
graduate-level student. The main conclusion that can be drawn from the work done so
far is this: it appears clear that the derivation of an accurate and complete design
equation is a straightforward task. What remains to be done is the characterization of
the interactions of the driven, resonant and non-resonant conductors (primarily
capacitance) and the inclusion of the impedance of the driven element, which is seen in
parallel with the impedance of the resonant coupled element.

Conclusions 

The ability to simulate this antenna configuration on a computer was essential to its
development. The MININEC-based ELNEC program was used, and its accuracy was
verified by the construction of several test antennas using various numbers and sizes of
conductors. Once validity of the model was established, a large number of iterations
could then be performed to collect data to establish the basic relationships among
physical dimensions, resulting in a useful design equation which has a minimum of
constraints.

Appendix

Notes  on  the  Patent  Application  and  Prior  Art 

Like  all  “discoveries,”  the  Coupled-Resonator  principle  described  in  this  paper  is 
based  in  part  on  the  work  of  others,  having  been  initiated  by  an  examination  of  the 
open-sleeve  antenna.  In  addition,  several  specific  antenna  configurations  are  known 
which  incorporate  some  characteristics  described  here  and  in  the  referenced  patent 
application. 

The intended contribution of this discussion is that the Coupled-Resonator
concept be recognized as a basic antenna design principle that can be widely applied. In
this light, all parasitic antennas, such as Yagi-Uda arrays and the coaxial and
open-sleeve antennas, are specific applications of this fundamental principle. The scope
of the patent application is the use of this principle to create dipole and monopole
antenna elements which cover up to seven, and possibly more, frequencies.

It  is  hoped  that  further  development  of  the  work  begun  here  will  result  in 
improved  understanding  of  the  behavior  of  conductors  in  proximity.  The  analysis 
tools  are  in  place  to  accurately  model  the  behavior  of  various  configurations,  but 
work  remains  to  be  done  to  develop  complete  synthesis  tools  that  implement  specific 
designs using the C-R relationships.

ANTENNA DESIGN AND DEVELOPMENT USING NEC-WIN

NEC-WIN is a graphically oriented antenna design and optimization program based on the Numerical
Electromagnetics Code core. NEC-WIN provides a wide range of powerful features which significantly enhance
the ability of the antenna designer to quickly and efficiently analyze even the most complex antenna structures.
The program is based in Windows and takes advantage of the graphical nature of Windows to provide an
enhanced interface for the NEC commands. While users will still be able to enter NEC commands as they have
done previously, they will also be able to access graphically assisted commands which prompt the user for the
required data.

User efficiency is further enhanced by the three-dimensional viewing capability of the program.
Using NEC-VU, the designer has instant access to a graphical representation of the structure during the data
entry process. NEC-VU can be used to visualize the antenna using rotate, zoom and pan features. Furthermore,
the designer has the ability to analyze the structure in order to verify wire connections and placement. This
process is further enhanced by using NEC-VU’s configuration capability to highlight non-connected wires.
Once a wire without a connection is found, the user can access the edit mode and use a cursor to highlight the
wire. NEC-VU indicates which wire number is highlighted and where the wire was created in the input file.
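
The idea of flagging non-connected wires can be sketched as an end-point comparison. This is a hypothetical illustration and not NEC-VU's algorithm: the function name, the tolerance value and the data layout are assumptions, and the sketch only compares wire end points, whereas NEC also allows a wire end to connect at a segment junction in the interior of another wire.

```python
import math


def free_ends(wires, tol=1e-3):
    """Return (wire_index, end_index) pairs for wire ends that touch no
    other wire's end point within `tol` (same length units as the input).

    wires: list of ((x1, y1, z1), (x2, y2, z2)) tuples; indices are
           0-based here, end_index 0 is the start and 1 is the stop.
    """
    # Flatten to (wire index, end index, point) records.
    ends = [(i, e, p) for i, wire in enumerate(wires) for e, p in enumerate(wire)]
    unconnected = []
    for i, e, p in ends:
        touches_other = any(
            j != i and math.dist(p, q) <= tol for j, _, q in ends
        )
        if not touches_other:
            unconnected.append((i, e))
    return unconnected
```

A viewer could then draw the returned ends in a distinct color, which is the kind of visual check the paragraph above describes.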

After successfully creating the input file, the user can process the file within NEC-WIN as a standard
NEC file or with optimization capability, depending on the functions that were selected in the input file.
NEC-WIN provides a comprehensive plotting capability for visualizing the results. The user can select line type, line
width and line color for various sets of data to be plotted. In addition, the user can overlay various results on a
single plot and incorporate and overlay results from a previous antenna file. Full control of legends, titles, line
types and widths for the plot is provided.

This paper will show how NEC-WIN enables the user to enter data using graphical assistance and the
viewing capabilities of NEC-VU. Following data entry, the paper will detail how to run NEC and obtain the
results of the analysis using the plotting ability of NEC-WIN.

NEC-WIN  DATA  ENTRY: 

One of the most difficult problems related to NEC-based codes has been the user interface. The typical
way a user directs NEC to process an input file is to create that file using a specific set of
commands and syntax with a DOS text editor. This is a cumbersome technique and it has led to significant
frustration for many users. NEC-WIN is designed to provide users with a graphical interface in the Windows
environment so that they do not have to experience the typical frustration when creating a NEC input file.

NEC-WIN’s editor has been optimized for the different experience levels that an antenna designer may
have. In the “expert” mode, NEC-WIN provides a text editor where users familiar with the NEC commands can
enter the information directly. NEC-WIN includes commands such as cut, copy, paste, undo, redo, find and
replace with the text editor to support data entry. These commands become very useful when working with
large input files or during the editing process of multiple input files which are simultaneously open on the
screen.


The second mode for data entry with NEC-WIN is designed for users that have some knowledge of
NEC commands, but are not familiar enough with the commands to know all of the data that must be entered
for a particular command. Figure 1 is a display of a typical screen for NEC-WIN in this mode. Under the
NEC-WIN main menu bar on the left side of the screen is the editor box that contains the text for an input file. On the
right side of the screen is a scrollable menu that provides a list of the NEC commands along with a short
statement as to each command’s use. As an example, if the user wants to command NEC to create a radiation
pattern, the user would scroll through the card list until he reached RP: Radiation Pattern Request. By selecting
this card, the user will automatically have an “RP” placed in the text file and he will be vectored to the
Radiation Pattern input screen as seen in Figure 1.

Figure 1: NEC-WIN Data Input Screen

The Radiation Pattern screen in Figure 1 typifies the graphical nature of the displays used for data
entry. These screens are based on providing a consolidated list of inputs that are required for NEC to create the
appropriate text line. For the RP card, the user will enter the start and stop points for theta and phi along with
the number of steps used for processing. The remaining inputs associated with the RP card are set to a default
condition. The user can enter the basic data and accept the default information to quickly create the card, or one
can access the defaults and change these to the appropriate settings. By creating defaults, the input of data is
more intuitive and the process is significantly faster.
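
The way a form with defaults turns into a single RP text line can be sketched as follows. This is not NEC-WIN's code: the function name and the free-format, space-separated output are assumptions (the original NEC-2 deck used fixed columns), while the field order, mode, number of theta and phi points, XNDA, starting angles, angle increments, RFLD and GNOR, follows the standard NEC-2 RP card.

```python
def rp_card(theta_start, theta_stop, n_theta, phi_start, phi_stop, n_phi,
            mode=0, xnda=0, rfld=0.0, gnor=0.0):
    """Build a free-format NEC-2 RP (radiation pattern) card.

    The user supplies start/stop angles in degrees and the number of
    points; the remaining fields take defaults, as on the input screen.
    """
    if n_theta < 1 or n_phi < 1:
        raise ValueError("point counts must be at least 1")

    def increment(start, stop, count):
        # NEC wants an angle increment, not a stop angle.
        return (stop - start) / (count - 1) if count > 1 else 0.0

    fields = [
        str(mode), str(n_theta), str(n_phi), str(xnda),
        f"{theta_start:g}", f"{phi_start:g}",
        f"{increment(theta_start, theta_stop, n_theta):g}",
        f"{increment(phi_start, phi_stop, n_phi):g}",
        f"{rfld:g}", f"{gnor:g}",
    ]
    return "RP " + " ".join(fields)
```

A call such as `rp_card(0, 90, 91, 0, 0, 1)` requests an elevation cut in one-degree steps with every optional field left at its default.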

The other mode for data entry with NEC-WIN is the “novice” mode. This mode is designed for users
that have no knowledge of NEC, but understand how to create a wire model of an antenna. When the user
enters the novice mode, three buttons (Comments, Geometry, Output) are available for the user to enter data.
The key area for the novice is the geometry entry section. In this area, the user will enter the start and stop
points of the wire in an “Excel-like” worksheet. In addition to entering the wire end-point information,
NEC-WIN has graphical-based screens for scaling, rotation, or translation of a particular wire. The user can add
sources and loads to wires within the geometry input screen. The key to the data entry process in the novice
mode is that it can be done in any order and the user does not need to be aware of how a NEC input file is
constructed.
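
A worksheet of wire end points maps naturally onto NEC geometry cards. The sketch below is a hypothetical illustration of that mapping, not NEC-WIN's implementation: the `Wire` class, the function name and the free-format output are assumptions, while the GW field order (tag, segment count, two end points, radius) and the CM, CE and GE cards are standard NEC-2.

```python
from dataclasses import dataclass


@dataclass
class Wire:
    start: tuple     # (x, y, z) of end 1
    end: tuple       # (x, y, z) of end 2
    radius: float    # wire radius, same units as the coordinates
    segments: int    # number of segments for this wire


def wires_to_geometry_cards(wires, comment="generated from worksheet"):
    """Return the comment and geometry section of a NEC input deck."""
    cards = [f"CM {comment}", "CE"]          # comment cards come first
    for tag, wire in enumerate(wires, start=1):
        coords = " ".join(f"{v:g}" for v in (*wire.start, *wire.end))
        cards.append(f"GW {tag} {wire.segments} {coords} {wire.radius:g}")
    cards.append("GE 0")                     # end of geometry, no ground plane
    return cards
```

Because the rows are converted only when the deck is written, the user can fill in the worksheet in any order, which is the property the novice mode relies on.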


VIEWING  AN  ANTENNA  USING  NEC-VU: 


NEC-VU provides a fast, easy, efficient way to view the segment geometry of a NEC antenna structure.
The program is accessed from the main control panel of NEC-WIN by clicking on the “eye” icon. NEC-VU
analyzes the NEC input file that the user is developing in order to provide a three-dimensional representation of
the antenna geometry on the screen. A sample of a NEC-VU display screen for a very complex structure is
shown in Figure 2. After the object is displayed on the screen, the user can analyze the structure using the
mouse to perform rotation, panning and zooming. The ability to perform these operations in a continuous
real-time mode with no “flicker” is where the power of NEC-VU is evident.

NEC-VU was developed in Assembly language in order to maximize the rotation speed of the antenna.
By paying close attention to the vertical refresh rate of the screen and optimizing the code such that all of the
updates can be done in a compressed time frame, NEC-VU has the ability to rotate the antenna structure in a
manner that is typically seen only on workstations. Furthermore, NEC-VU maintains the same ability to
provide continuous rotation on input files that contain over 4000 segments.

The key application of NEC-VU is for the visualization of the antenna structure and analysis of the
structure to ensure that the input file has been generated correctly. NEC-WIN enables the user to access a
configuration file where one can define different line types, such as free ends, segments, and junctions, to have
different colors. This makes visual verification of the antenna much easier in that free ends, for example, can be
defined as a different color than the rest of the segments such that they will be easy to identify during the
analysis. NEC-VU also supports an editing mode whereby the user can move the cursor to a segment on the
display and have NEC-VU indicate what wire number corresponds to this point as well as the location in the
input file where that wire was created.

[Figure 2: sample NEC-VU display screen; on-screen title: DD963 file "dd963arr.inp"]

PROCESSING THE ANTENNA STRUCTURE:


NEC-WIN processes antenna files using NEC-OPT, which is based on the NEC2 core. NEC-OPT
consists of a quasi-Newton optimizer integrated with a double precision version of NEC2 and the required
interfaces to provide the necessary communication between the two sections. Both constrained and
unconstrained optimization are provided by the optimizer. The user can provide the necessary commands to
control the design goals, variables and optimization parameters. In addition to optimization, the NEC-OPT
package has the ability to sequence any variable defined in the NEC input file over a specified range of values.
This is useful for developing design curves or performing a worst-case analysis.

NEC-WIN contains an extended command set beyond the standard NEC commands in order to support
users that desire optimization. The power of using the NEC-OPT core is apparent when one establishes goals
for certain parameters that the user wants to optimize. Four different NEC output types may be sampled and
processed as one changes the geometry in an effort to reach these goals. The specific outputs that can be
optimized are far-field patterns, near-field patterns, source impedance, and segment currents. For each goal
there are many options which select specific parts of the desired NEC output data or define the processing to be
performed on it. The many options provide a generic and versatile interface to nearly every type of NEC output
data. High level characteristics such as gain, pattern beamwidth, VSWR, front-to-back ratio and many others
may be chosen for optimization and output processing.

Multiple goals may be specified for the same run of NEC-OPT. Each goal may be separately weighted
to allow the user to balance the significance of each goal to meet the specific needs of the problem. At the
conclusion of the processing, NEC-OPT creates a file which contains the results of the optimization. The user
can view this information and alter the input data set to correspond to the optimized structure.
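
Weighted goals and variable sequencing can be illustrated with a small sketch. The paper does not give NEC-OPT's goal function, so the weighted sum of squared errors used here is an assumption, as are all names; `evaluate` stands in for a run of the solver and must be supplied by the caller.

```python
def weighted_cost(results, goals):
    """Combine several goals into one figure of merit.

    results: dict mapping an output name (e.g. "gain") to its computed value
    goals:   iterable of (name, target, weight) tuples
    Returns the weighted sum of squared deviations from the targets.
    """
    return sum(weight * (results[name] - target) ** 2
               for name, target, weight in goals)


def sweep(evaluate, values, goals):
    """Sequence one variable over `values` and score each point.

    evaluate: callable taking the variable value and returning a results dict
    Returns (best_value, best_cost, table), where table lists every
    (value, cost) pair, suitable for plotting as a design curve.
    """
    table = [(value, weighted_cost(evaluate(value), goals)) for value in values]
    best_value, best_cost = min(table, key=lambda row: row[1])
    return best_value, best_cost, table
```

A larger weight makes the corresponding goal dominate the total, which is how separate weighting lets the user balance competing requirements. A quasi-Newton optimizer would minimize the same kind of scalar cost using gradient information instead of a fixed list of values.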


VIEWING  RESULTS  WITH  NEC-WIN: 

While many of the problems associated with NEC-based codes have been with the user interface, another
glaring weakness of the codes is the inability to look at the results without having to export data to alternate
packages for processing and plotting. NEC-WIN has been developed with an extensive plotting package that
provides unmatched flexibility for creating plots of the results obtained after running NEC-OPT. The package
provides the user with a wide variety of commands that enable one to customize the output for their particular
requirements and save these settings as a series of macros for later usage.

Access to pattern plots is available from the main NEC-WIN menu by clicking on the “pattern” icon.
This commands NEC-WIN to examine the current file and determine what pattern plots are available.
Following the processing, NEC-WIN presents a display as seen in Figure 3. On the left side of the screen is a
summary of the available patterns. The table contains each pattern’s type (azimuth or elevation), the range of
theta and phi, along with the frequency and the file name where the pattern is contained. On the right side of the
screen are controls for defining the horizontal gain, vertical gain and total gain. For each gain type, one can
define the type of line, line color and line width corresponding to the selected pattern in the table. In addition,
the user has the ability to look at the output files from previous processing runs and combine the available patterns
such that one can do overlays.

Figure 3: Radiation Pattern Configuration Screen

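The relation between the three gain types on the configuration screen can be sketched in a few lines. This assumes, as in the standard NEC radiation pattern output, that the horizontal and vertical gains are the two orthogonal polarization components in dB and that the total gain is the sum of their powers; the function name is illustrative.

```python
import math


def total_gain_db(horizontal_db, vertical_db):
    """Total gain in dB from the horizontal and vertical components in dB.

    The components are converted to power ratios, added, and converted
    back. A component may be float("-inf") to represent zero power, but
    not both at once, since the total would then have no dB value.
    """
    power = 10.0 ** (horizontal_db / 10.0) + 10.0 ** (vertical_db / 10.0)
    return 10.0 * math.log10(power)
```

Two equal components give a total about 3 dB above either one, while a total dominated by one polarization is nearly equal to that component.
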

After the plots have been configured, the user can generate a graph after adjusting plot parameters in the
azimuth and elevation plot control screens. When this is complete, a graph as seen in Figure 4 is generated. The
user has complete control over the look and feel of this graph in that one can change the font style, size, and
type for all labels, titles and markings on the graph. In addition, a title of an arbitrary length can be added and
the program will automatically scale the graph to fit on the screen or the output plot. In addition, the user can
add a legend with customized information on each line of the legend along with the ability to add a title and a
footer. Furthermore, the position of the legend and title can be adjusted to a variety of locations.


Plots are available to the screen as well as to a printer. The user can also output tabular data
corresponding to the graph. One of the most interesting features of NEC-WIN is the ability to copy the entire
plot or a section of the plot to a clipboard and move to a word processor where the information can be pasted
into a document. NEC-WIN also allows the user to create a three-dimensional surface plot of the output as seen
in Figure 5. This pattern can be examined in the same way that a user would manipulate the antenna using
NEC-VU. Using NEC-SURF, the user can access rotate, pan and zoom functions to gain a better understanding
of the antenna’s performance.

Figure 4: Sample Output of NEC-WIN Plot Program

Figure 5: NEC-SURF Output of Antenna Pattern


CONCLUSION: 


NEC-WIN is a powerful antenna design and analysis program that provides users with an unmatched
level of performance and options. The program uses a modified version of NEC2 to include optimization
capabilities as part of the core processing. Using NEC-WIN, the antenna designer has graphical based input and
three-dimensional viewing of the structure to assist in developing the input geometry. After the file has been
processed, the user can plot various patterns, obtain a tabular output of the data or view that antenna pattern in a
three-dimensional surface plot.

THE "PAINT" SYSTEM

A UTD/NEC HYBRID PACKAGE FOR SIMULATING ANTENNA PATTERNS
OVER 3-DIMENSIONAL IRREGULAR TERRAIN

Department of Electrical Engineering
The  Pennsylvania  State  University 
University  Park,  PA  16802 


Abstract 

This paper describes a software package (PAINT) which utilizes the uniform theory of diffraction (UTD) in
conjunction with the numerical electromagnetics code (NEC) to predict radiation patterns of antennas situated in
3-dimensional (3D) irregular terrain at high frequencies (HF). The PAINT system is a user-friendly integrated software
package which performs all the tasks necessary to predict antenna patterns over 3D irregular terrain. The software package
is self-contained and runs on a personal computer (PC) platform. The package can utilize existing terrain databases to
generate 3D terrain models. The models are analyzed with an existing UTD program, and the results can be processed
and displayed by other routines in the integrated software package. The 3D modeling capability allows azimuth as well
as elevation patterns to be simulated. Primary validation is accomplished by comparing to field measurements of elevation
patterns. In addition, the antenna performance of an existing HF receiving facility is analyzed, and the results are
compared to predictions from NEC utilizing a flat earth model.

1.  Introduction 

The presence of an irregular foreground can have a significant effect on the performance of HF antennas [1].
It is desirable to be able to simulate these effects as part of the evaluation process before going through the expense of
constructing and measuring the antenna. This work is part of an on-going research effort sponsored by the Navy and
conducted at the Applied Research Laboratory of the Pennsylvania State University to study HF communications [2].

PAINT is an acronym that stands for Performance of Antennas In Non-ideal Terrain. As the name implies, the
system is designed to simulate the perturbations to antenna patterns caused by locating the antenna over 3D irregular
terrain. The PAINT modeling system is an integrated collection of software programs written in Fortran to run on a PC
platform using the MS-DOS operating system. The PAINT system is designed to isolate the user from unnecessary details
of the simulation process. The PAINT system makes extensive use of menus and graphical displays, which allow an
unfamiliar user to generate and analyze complex models with only minimal training.

The PAINT system can directly access various sources of terrain data such as the United States Geological Survey
(USGS) Digital Elevation Model (DEM) terrain data, and the Defense Mapping Agency's (DMA) Digital Terrain Elevation
Data (DTED). Graphical data displays enable the user to quickly select and process millions of points of terrain data into
a usable terrain model. Complex antennas used in the model may be defined by the user, or they may be taken directly
from the output of the NEC program. The UTD analysis of the model is performed by the NECBSC program developed
for the Navy by Ohio State University [3]. The output patterns may be displayed in polar form on the screen or sent to
a hard copy device.

2.  The Modeling Process

The PAINT modeling system is an integrated environment that provides all the routines necessary to simulate the
pattern of an antenna situated in irregular terrain. The initial simulation process consists of a sequence of steps from site
selection through simulated output plotting. Each step may consist of executing one or more PAINT commands. This
section provides a basic overview of the modeling process from start to finish. Figure 1 shows the basic steps of the
modeling process and the two letter commands of the PAINT modeling system that perform each function.

The first step in the modeling process is site selection. The goal of site selection is to select a small region of
terrain around the area where the antenna is to be located. The region should be as small as possible while still containing
all major structures which affect the antenna's performance. For instance, if the antenna is located in a valley between
two hills, the selected terrain site should be large enough to contain the two hills nearest the antenna. Three different
sources of terrain data may be accessed by the PAINT system: USGS DEM data, Level 1 DMA DTED data on
CD-ROM, and X,Y,Z triplets contained in a text file. Each of the USGS and DTED data sets may contain over 1.4 million
points of terrain data. To handle this amount of terrain data effectively, the user performs a 2-stage graphical
selection process to determine a smaller region of interest.

After the site terrain data has been chosen it can be further
simplified, if desired, and converted to the flat plate structure
which is used in the NECBSC model. The plates used in the
model may be given lossy dielectric parameters to more
accurately simulate real ground.
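
The simplification and plate conversion step can be pictured with a short sketch. The following is only an illustration under stated assumptions: the function names, the uniform decimation rule, and the choice of two triangular plates per grid cell are hypothetical and are not taken from PAINT or from the NECBSC input format. Triangles are used here simply because three points always define a flat plate.

```python
# Illustrative sketch only; the function names and the two-triangles-per-cell
# layout are assumptions, not the PAINT or NECBSC plate format.

def decimate(grid, step):
    """Keep every `step`-th row and column of a 2D list of elevations."""
    return [row[::step] for row in grid[::step]]


def grid_to_plates(grid, spacing):
    """Split each grid cell into two triangular (hence planar) plates.

    `grid[i][j]` is the elevation at x = j * spacing, y = i * spacing.
    Each plate is a tuple of three (x, y, z) corners.
    """
    plates = []
    for i in range(len(grid) - 1):
        for j in range(len(grid[0]) - 1):
            p00 = (j * spacing, i * spacing, grid[i][j])
            p10 = ((j + 1) * spacing, i * spacing, grid[i][j + 1])
            p11 = ((j + 1) * spacing, (i + 1) * spacing, grid[i + 1][j + 1])
            p01 = (j * spacing, (i + 1) * spacing, grid[i + 1][j])
            plates.append((p00, p10, p11))
            plates.append((p00, p11, p01))
    return plates


# A 5 x 5 patch of made-up elevations on a 30 m grid, thinned to 3 x 3.
fine = [[float(i + j) for j in range(5)] for i in range(5)]
coarse = decimate(fine, 2)
plates = grid_to_plates(coarse, spacing=60.0)
```

Thinning the grid by a factor of two in each direction cuts the plate count by roughly a factor of four, which is the kind of tradeoff between model size and terrain fidelity that the simplification step has to manage.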

The next step is to define the antennas used in the
simulation. The user may define multiple wire antennas with
various orientations, or if complex antennas have been
previously modeled with NEC, they may be taken directly
from the NEC output file. By using the currents from NEC
to define the source, ground losses may be included in the
final output patterns. The frequency may be set manually, or
if a NEC source is used, the frequency is set to that used in
the NEC analysis. Next, the user may define azimuth and
elevation patterns as well as a totally arbitrary pattern cut if
so desired.

At this point the model is complete and ready to be
analyzed. All of the parts of the model generated by the
previous operations are combined in an input file, and the simulation is analyzed by the NECBSC program. Azimuth and
elevation plotting routines are included in the package to display the simulated results.

Once the model is complete the user does not have to go through all the steps from start to finish every time.
He may go back and alter one or more of the model creation steps by running the command again with different inputs
and then rerun the simulation and examine the resulting patterns.

As an aid to the modeling process a 3D wire-frame viewer was developed to display the 3D plate and antenna
structures. 3D visualization is necessary for complex models to verify proper orientation and placement of the antennas.

3.  Validation  Results 

The PAINT system has been validated by comparing simulated results to accurately measured elevation patterns
taken over irregular terrain located near Cedar Valley, Utah [4]. The terrain data is shown in Figure 2 and a simplified
plate model is shown in Figure 3. The antennas in Figure 3 have been exaggerated to increase their visibility. There are
actually 6 antennas at the site, a monopole and a dipole at each of 3 locations. Only one antenna was active during each
measurement. The relative elevation patterns for an 8 MHz analysis are shown in Figure 4. The antenna locations labeled
front, top, and back in Figure 4 correspond to the antenna locations shown in Figure 3 going from right to left, respectively.
Figure 4 indicates good agreement between the measured and simulated data for nearly all cases. For the front location
the hill is located on the left side of the antenna. The blockage caused by the hill causes the front location patterns to be
reduced on the left side at low elevation angles. The top location is not obstructed on either side. The back location is
obstructed by the hill on the right side, but less so than the front location.

Figure 5 shows a terrain model of an existing HF receiver site located at Rock Springs, PA. This site was
modeled by the PAINT system using DTED terrain data on CD-ROM. No measured data for the site was available to
compare to the simulated data, so the PAINT results are compared to NEC results for a flat lossy earth model (ε_r = 12,
σ = 6.5 mS/m). The antenna used for the simulation is an 8-element log-periodic dipole array (LPDA) shown in Figure
6. The antenna was simulated at an appropriate height over flat lossy ground using NEC. The PAINT system utilized
the currents from NEC to define the source and calculated the additional effects caused by the irregular terrain. It would
be very difficult to simulate this antenna with the NECBSC program without the NEC hybrid interface used by the PAINT
system. The NEC currents include the effects of ground losses, so the output patterns from the PAINT system represent
an actual gain in dBi. The simulated results are compared in Figure 7. This example demonstrates the ability of the 3D
PAINT model to simulate azimuth as well as elevation patterns. The LPDA is pointed down the terrain slope shown in
Figure 5. This corresponds to the left side of the elevation plot and straight up on the azimuth plot shown in Figure 7.
The azimuth pattern was taken at an elevation of 10 degrees, which is near the peak of the pattern. As shown in both plots
of Figure 7, the hill in the background significantly corrupts the back-lobe pattern of the antenna. The hill also provides
some additional gain in the forward direction at low elevation angles. The front-lobe of the azimuth pattern and the
majority of the elevation pattern are in good agreement with the NEC flat earth result.

Figure 1  Steps in the PAINT Modeling Process.
Valley, Utah Site with Antennas

4.  Conclusions  and  Future  Work 

The PAINT system is a very powerful tool for simulating irregular terrain effects on antenna patterns. It provides
access to the vast amount of existing terrain database information, and it provides the tools to quickly convert the raw
terrain information into a usable model. The PAINT system provides the ability for an unfamiliar user to begin studying
the effects of irregular terrain quickly without spending a long time learning how to use complex and unfriendly software.
By using a 3D model, the PAINT system can produce azimuth as well as elevation patterns.

The results of the PAINT system compared favorably with accurately measured elevation data taken at the Cedar
Valley, Utah test site. The PAINT results also compared favorably with NEC simulations of the Rock Springs, PA
receiver site. While no measured data was available for this site, the simulations provide insight into the antenna's
performance in the presence of irregular terrain.

The PAINT system has been validated primarily at HF. At lower frequencies, it is expected that the simulation
accuracy will suffer due to the lack of a surface wave in the simulation. At higher frequencies the simulation is limited
by the ability to accurately define the terrain. As the frequency increases, more plates are required to accurately define
the terrain. This causes the simulation run time to increase drastically, because the simulation time increases exponentially
with the number of plates in the model. Another limitation of the simulation is that currently only single order diffraction
effects are considered in the simulation. This prevents pattern predictions into regions where the path to the source is
shadowed by more than one obstacle. A future update to the NECBSC program which includes higher order diffraction
effects may correct this situation. For most cases these limitations do not prevent the PAINT system from making accurate
simulations at HF.

One topic of current research is to determine the best method of terrain simplification. The terrain databases
currently available contain millions of points of elevation data. Currently on the PC, models are limited to a few hundred
plates or less due to the time required to run the simulation. The PAINT system contains several different methods of
terrain simplification, and work is being done to determine the tradeoffs between the amount of simplification and the
simulation accuracy for the various methods.

The PAINT system is flexible enough to allow the use of synthetic user-generated terrain data as a source of
terrain information for model creation. This allows a wide range of canonical problems such as Gaussian hills,
semicircular bosses, or any analytic terrain surface to be modeled. Future work may be directed toward analyzing some of
these synthetic shapes and comparing the PAINT models to the results of other simulation techniques.
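
As an illustration of such synthetic input, the sketch below samples a Gaussian hill on a regular grid and formats the samples as X,Y,Z triplets, one per line. The hill height, width, grid extent, and the whitespace-separated text layout are all assumptions chosen for the example; the exact triplet file layout expected by PAINT is not specified here.

```python
import math


def gaussian_hill(x, y, height=100.0, sigma=200.0):
    """Elevation (m) of a hypothetical Gaussian hill centred at the origin."""
    return height * math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))


def make_triplets(half_width=500.0, step=100.0):
    """Sample the hill on a square grid and return (x, y, z) tuples."""
    n = int(round(2.0 * half_width / step)) + 1
    points = []
    for i in range(n):
        for j in range(n):
            x = -half_width + j * step
            y = -half_width + i * step
            points.append((x, y, gaussian_hill(x, y)))
    return points


def format_triplets(points):
    """One 'X Y Z' line per point (assumed layout, for illustration)."""
    return "\n".join(f"{x:.1f} {y:.1f} {z:.3f}" for x, y, z in points)


triplets = make_triplets()
text = format_triplets(triplets)
```

Any other analytic surface can be substituted for `gaussian_hill` without changing the rest of the sketch.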

Abstract

For a conducting strip, a precise closed form characterization of the field distribution for
arbitrary shaped edges and any conductor thickness is not possible. A matched asymptotic
expansion and a finite element code are used to study the electromagnetic fields local to the edge
of a conducting strip. This formulation is shown to be valid for any edge shape and strip thickness
versus skin depth. In this paper, this formulation is discussed and results for the 90 degree edge
are presented.


Introduction 

Planar circuits are made by laying strips of conductor (microstrip, coplanar waveguides,
etc.) on a grounded substrate as shown in Figure 1. As signals propagate along these microstrip
lines, power is lost. The power loss depends on a number of factors such as conductor
thickness t, skin depth δ, and the edge shape as shown in Figure 2. While the conductor power
loss is in general a small effect, it increases as the operating frequency increases. Consequently,
considerable effort has been expended in trying to characterize it [1-6].

Figure 1.  Planar circuit example showing layout of microstrip lines (cross-section A-A').


Figure  2.  Cross  section  of  the  isolated  edge. 


Recently, Holloway and Kuester [7] used the method of matched asymptotics to propose
quasi closed form expressions for the conductor loss for planar structures as a function of t, δ, and
the edge shape. They studied losses from an isolated conducting edge. Their expressions yielded
excellent results which agreed well with those of Heinreich [2], Goldfarb [3], and Wheeler [4]. In
[7], the distribution of the electromagnetic fields around the edge of the conductor was not
studied explicitly. The emphasis was on computing the power loss. In this paper, we use the
method of matched asymptotics to explicitly study the field distribution around the isolated edge
of a conducting strip. The field distribution for a 90° edge will be investigated for various t/δ
ratios.
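
Since the results are organized by the ratio t/δ, a short numerical sketch of that ratio may be useful. It uses the standard good-conductor skin depth, δ = sqrt(1/(π f μ σ)). The conductivity of copper, the 10 GHz frequency, and the 3 µm strip thickness are illustrative assumptions and are not values taken from this paper.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, H/m


def skin_depth(freq_hz, sigma, mu_r=1.0):
    """Good-conductor skin depth in metres: sqrt(1 / (pi * f * mu * sigma))."""
    return math.sqrt(1.0 / (math.pi * freq_hz * mu_r * MU_0 * sigma))


def thickness_ratio(t, freq_hz, sigma, mu_r=1.0):
    """Strip thickness expressed in skin depths, t / delta."""
    return t / skin_depth(freq_hz, sigma, mu_r)


# Assumed example: copper (about 5.8e7 S/m) at 10 GHz, 3 micron thick strip.
delta = skin_depth(10.0e9, 5.8e7)
ratio = thickness_ratio(3.0e-6, 10.0e9, 5.8e7)
```

Because δ scales as the inverse square root of frequency, a strip of fixed thickness spans more skin depths as the operating frequency rises.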


Formulation 


We are interested in finding the fields inside the strip conductor local to the edge (see
Figure 2). Assuming a dominant current along the z-axis, we have the TM polarized field
components $E_z$, $H_x$, and $H_y$. Maxwell's equations then reduce to

$$\nabla \times (\mathbf{a}_z E_z) = -j\omega\,\mu_{d,c}\,\mathbf{H}$$

and

$$\nabla \times \mathbf{H} = j\omega\,\epsilon_{d,c}\,\mathbf{a}_z E_z$$

in which the subscripts d,c refer to dielectric (air in this case) and conductor, respectively. The
boundary conditions on the electric and magnetic fields are given by

$$E_t^{d}\big|_C = E_t^{c}\big|_C$$

and

$$H_t^{d}\big|_C = H_t^{c}\big|_C$$

where the subscript t refers to tangential components. Next the electromagnetic fields can be
written in terms of the magnetic vector potential $\mathbf{A} = \mathbf{a}_z A_z$ as

$$\mathbf{H} = \frac{1}{\mu}\,\nabla \times (\mathbf{a}_z A_z)$$

and

$$E_z = -j\omega A_z$$

The magnetic vector potential must then satisfy the modified Helmholtz equation

$$(\nabla^2 + k_{d,c}^2)\,A^{d,c} = 0$$

with the following boundary conditions

$$A^{d}\big|_C = A^{c}\big|_C$$

and

$$\frac{1}{\mu_c}\,\frac{\partial A^{c}}{\partial n}\bigg|_C = \frac{1}{\mu_d}\,\frac{\partial A^{d}}{\partial n}\bigg|_C$$


In [7], the method of matched asymptotics is used to expand the fields in terms of the
small parameter $\nu = k_d t/2$, where $k_d$ is the wave number in the dielectric, and $t$ the strip
thickness. In the outer region, far from the edge, and in the dielectric, they expanded the potential
as

$$A^{d} \sim T^{0}(x,y) + \nu\,T^{1}(x,y) + \nu^{2}\,T^{2}(x,y) + O(\nu^{3}) \qquad (1)$$

and far from the edge, inside the conductor, they expanded the potential as

$$A^{c} \sim U^{0}(x,Y) + \nu\,U^{1}(x,Y) + \nu^{2}\,U^{2}(x,Y) + O(\nu^{3}) \qquad (2)$$

where the scaled variable, $Y = y/\nu$, was used to account for the possible rapid field variations in
the y-direction. T and U must then satisfy the following modified Helmholtz equations:

$$\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + k_d^2\,T = 0, \qquad \frac{\partial^2 U}{\partial Y^2} + \omega^2 \mu\,\epsilon_0\,G\,U = 0 \qquad (3)$$

and boundary conditions

$$T(x,y)\big|_C = U(x,Y)\big|_C \quad \text{and} \quad \frac{1}{\mu_d}\,\frac{\partial T(x,y)}{\partial y} = \frac{1}{\mu_c\,\nu}\,\frac{\partial U(x,Y)}{\partial Y} \qquad (3a)$$

where G represents the relative permittivity in the conductor.


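Similarly, in the inner region close to the edge, the potential was expanded in an
asymptotic series. As pointed out in [7], the fields in the inner region must be expanded in half
powers of $\nu$ as

$$A^{d} \sim V_0(X,Y) + \nu^{1/2}\,V_1(X,Y) + \cdots \qquad (4)$$

in the dielectric and as

$$A^{c} \sim W_0(X,Y) + \nu^{1/2}\,W_1(X,Y) + \cdots \qquad (5)$$

in the conductor. The second scaled variable, $X = x/\nu$, was introduced to account for the rapid
field variations along x. V and W must satisfy the following modified Helmholtz equations

$$(\nabla^2 + k_d^2\,\nu^2)\,V = 0, \qquad (\nabla^2 + k_c^2)\,W = 0 \qquad (6)$$

and corresponding boundary conditions

$$V\big|_C = W\big|_C, \qquad \frac{1}{\mu_c}\,\frac{\partial W}{\partial n} = \frac{1}{\mu_d}\,\frac{\partial V}{\partial n} \qquad (6a)$$

where $\partial/\partial n$ refers to the normal derivative along the conductor dielectric interface, and

$$\nabla^2 = \frac{\partial^2}{\partial X^2} + \frac{\partial^2}{\partial Y^2}$$
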

Zeroth  and  first  order  solutions 

Substituting the expansions for $T(x,y)$ and $U(x,Y)$ given by (1) and (2) into the
modified Helmholtz equations (3), it can be shown that $U^{0} = 0$, while $U^{1}$ consists of two terms
with amplitudes $A_1(x)$ and $B_1(x)$. For the fields in the dielectric, $T^{1}$ must satisfy
$(\nabla^2 + k_d^2)\,T^{1} = 0$. The amplitudes $A_1(x)$ and $B_1(x)$ and the dielectric fields are then
determined from the boundary conditions given in (3a).

A similar process using Equations (4) through (6) for the zeroth order fields in the inner
expansions yields $V_0 = 0$ and $W_0 = 0$. The first order fields must satisfy the Laplace and
Helmholtz equations

$$\nabla^2 Q_1 = 0, \qquad (\nabla^2 + k_c^2)\,P_1 = 0 \qquad (7)$$

where $Q_1$ and $P_1$ are the first order potentials in the dielectric and in the conductor, respectively,
and are subject to the boundary conditions

$$P_1\big|_C - Q_1\big|_C = V(x,y), \qquad \frac{\partial P_1}{\partial n}\bigg|_C - \frac{\partial Q_1}{\partial n}\bigg|_C = D(x,y) \qquad (7b)$$


Numerical  Results 


The solution to the system of equations in (7) presents some unusual features. These
features were treated by Holloway in [8], where the general class of eddy current problems was
addressed. The solution shown in [8] utilized a variational technique and developed a functional
for the system in (7). The functional as given in [8] and [9] is


U-q  /",  )  =  ! {yp‘'' dV  -k ^  j dV  +  j  [VQ‘;  f  dV 

i'  V 


Gi") 


dp;^ 

dn 


■dS 


■j{pr-Qi‘ 


^  dn 


dS 


(«) 


where the 'tr' superscript denotes trial, and V and D are the jumps in the potential and its
derivative as given in equation (7b).

A finite element code was then written to implement the solution of (8). Results for the
fields within the conductor close to the edge are shown in Figures 3 and 4 for the 90° edge. In
Figures 3 and 4, we see the fields decay slowly for the thickness t ~ δ. As the thickness is increased
to t ~ 2δ, 4δ, and 10δ, we see that the fields appear to drop off in a somewhat exponential
manner along y, as expected. Similar results can be obtained for the 45° edge but are not included
here.


Figure 4.  3-D view of the fields for the 90° edge for t/δ = 4.


Conclusion 


The electromagnetic fields local to the edge of a conducting strip cannot be characterized
in a closed form. The matched asymptotic technique, combined with finite elements, allows these
local fields to be approximated in a very reasonable manner.

Abstract

A procedure for the adaptive definition of a Finite Element mesh matching a user-defined
error level, on the basis of an error estimate on a first Finite Element solution,
is presented. The proposed procedure sets up a tentative mesh defining an initial
distribution of nodes on the boundary on the basis of the estimated error and refines
the mesh until an error-based criterion in the bulk of the domain is satisfied. The
implementation of the procedure in a 2D Finite Element development environment is
presented and the results obtained are discussed.


1.  INTRODUCTION 

Finite Element solutions of electromagnetic problems are becoming widely available and used to
deal with a variety of problems, ranging from electrostatic and magnetostatic ones to those relevant to
high frequency devices, also including nonlinear media.

One of the problems that still hinders the diffusion of Finite Element applications for complex,
real-life design problems, now more and more affordable in terms of computer resources, is certainly the need
to discretize the problem domain. This phase of the solution process, crucial to obtain an adequate
accuracy, is often rather involved, particularly with the complex and multi-material domains frequently
found in real electromagnetic devices, and generally requires significant user skills to provide an
adequate solution.

A possible approach to tackle this problem is to perform a first solution on a rough,
automatic mesh, to estimate the error of the solution with an "a posteriori" estimation algorithm, to
"adapt" the mesh, refining it where the error estimate is higher, to compute a new solution and to iterate
the procedure until the error estimate falls below a user-defined level. Because of the significant potential
advantages of this approach, research on error estimation and adaptive meshing techniques has been very
active for many years, in a series of directions [1-7].
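
The solve, estimate, adapt, iterate cycle described above can be sketched on a deliberately simple one-dimensional stand-in. The example below is not a Finite Element solver and is not the procedure proposed in this paper: the "solution" is piecewise linear interpolation of a known function, the error indicator is the interpolation error at each element midpoint, and adaption is plain bisection (h refinement). All names and parameters are assumptions made for illustration.

```python
import math


def adapt_mesh(f, a, b, tol, n_initial=4, max_iter=20):
    """Bisect elements of [a, b] until every midpoint indicator is <= tol."""
    nodes = [a + (b - a) * i / n_initial for i in range(n_initial + 1)]
    for _ in range(max_iter):
        new_nodes = [nodes[0]]
        refined = False
        for left, right in zip(nodes, nodes[1:]):
            mid = 0.5 * (left + right)
            # Stand-in error indicator: gap between f and its linear interpolant.
            indicator = abs(f(mid) - 0.5 * (f(left) + f(right)))
            if indicator > tol:
                new_nodes.append(mid)  # refine only where the indicator is high
                refined = True
            new_nodes.append(right)
        nodes = new_nodes
        if not refined:  # every element meets the user-defined level
            break
    return nodes


def bump(x):
    """Test function with a sharp feature at x = 0.5."""
    return math.exp(-50.0 * (x - 0.5) ** 2)


mesh = adapt_mesh(bump, 0.0, 1.0, tol=1.0e-3)
gaps = [r - l for l, r in zip(mesh, mesh[1:])]
```

The resulting mesh is fine near the sharp feature and coarse near the ends of the interval, which is the behaviour an adaptive scheme is meant to produce.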

The most followed strategies for mesh adaption in Finite Element codes generally build successive
improved meshes by refining the previous one under the guidance of an error estimation algorithm. The
refinement can be obtained by adding new elements (h refinement), by raising the order of the involved
elements (p refinement), by moving the position of the existing nodes (r refinement) or by combinations
of the above approaches. A number of different algorithms for error estimation and mesh adaption in
computational electromagnetics have been proposed, as mentioned above, including some developed and
tested by the group of the authors [8-11].

2.  MOTIVATIONS AND STRUCTURE OF THE PROPOSED PROCEDURE


In this paper, on the basis of the previous experience in error estimation and adaptive meshing by
the research group of the authors, a rather innovative approach to the adaption problem is proposed. In
this approach, after the first solution, instead of refining the initial mesh, a complete remeshing of the
domain is performed, on the basis of the error estimation on the first mesh.

A major disadvantage of remeshing algorithms, that is probably at the basis of their limited current
diffusion in engineering adaption procedures, is that they "forget" the previous node placement
information, and have then to face each time a meshing from scratch, using geometrical and error
estimation data only. This implies, to allow an efficient usage in "general purpose" computational
electromagnetic codes, that a very sound and reliable automatic meshing algorithm must be available, and
that also the exploitation of error estimation data must be reliable and robust, to allow a proper handling
of the wide variety of multi-region, intricate domains of practical interest. However, if the above
requirements can be satisfied, there are, in the opinion of the authors, some significant potential
advantages over more "traditional" refinement techniques:

i)  a remeshing procedure can also reduce the density of nodes, correcting possible "meshing
overkills" performed by the initial mesh;

ii)  in an efficient remeshing procedure a single iteration is frequently enough to satisfy the error level
required by the user, and in general a lower number of iterations can be expected;

iii)  in time-varying problems involving time discretization, where areas requiring high accuracy can
vary with time, the feature of "forgetting" previous meshings can turn into an advantage.

On the basis of previous experience of their research group in the area of error estimation and
h-refinement, and of some initial encouraging results obtained with a first remeshing algorithm [12], the
authors have devised the two-dimensional enhanced remeshing procedure presented in this paper. In
order to assess carefully its potential under conditions easy to evaluate, the current version of the
procedure is relevant to magnetostatic and electrostatic problems only, and is implemented with first
order triangular elements in a single material domain.

The procedure starts with an initial solution obtained on a mesh generated by means of an
automatic meshing routine and with a subsequent element-by-element error estimation over the whole
domain [13-16]. On the basis of the error estimate on each element, a "sizing function" on the problem
domain is defined. This function depends also on a desired error level defined by the user and indicates
an "optimal" local size of triangle for the given error level. For elements abutting on the boundary, the
estimated error is averaged and assigned to boundary and interface nodes; the values of nodal error
evaluated in this way are then used to define along the boundaries of the geometry a "spacing function",
also depending on the user-specified error previously mentioned.

The generation of the new mesh is then started, defining firstly, by means of the spacing function,
the number of nodes along each side of the problem boundary and their non-uniform distribution. Once
the position of nodes along all boundary sides has been selected, a first triangulation is performed by
means of the usual Delaunay criterion. Then a loop is started to define the triangles in which a node
should be added; the criterion for node addition is based on the sizing function previously defined. For
each new node a local reshaping of the mesh is performed to minimize badly shaped triangles; the loop is
completed when all triangles in the mesh are marked as small enough on the basis of the sizing function.
The mesh obtained in this way is then subjected to a final "smoothing" with geometrical criteria, to
complete the procedure.

To allow a deeper evaluation of the procedure, the algorithms used to perform the various steps of
mesh definition are described in detail in the following section.

3.  MESH DEFINITION ALGORITHM


The mesh definition algorithm is launched when a first solution on an initial mesh has been
performed, and an estimation of the error of the obtained solution has been computed. The error
estimation is assumed to be available separately on each element of the domain as a constant value as
provided by the error estimators previously mentioned [14-16].


3.1  Evaluation  of  the  sizing  function 

To define the sizing function on the problem domain, an "error weighted element size" is defined
over each element k as:

(1)

where $A_k$ is the area of the element, $e_{ref}$ is the reference error level defined by the user and $e_k$ is the
error estimate on the element. These quantities are then used to assign a "weighted nodal size" value $F_i$
to each mesh node as:


F-  = 


t=i 


_L 

N 


(2) 


where i is the generic node, the upper summation is performed over the m elements of the region of
support of the node and the lower one over the N elements of the whole problem domain.

Once the $F_i$ values have been computed for every mesh node, the "sizing function" S(x,y) can
be computed in every point of the problem domain using the element shape functions.
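
For first order triangular elements, evaluating a function from its nodal values with the element shape functions amounts to barycentric (linear) interpolation. The sketch below shows this for one triangle; the function names and the sample nodal values are assumptions for illustration, and the nodal values stand in for the weighted nodal sizes, whose defining formula is not reproduced here.

```python
def shape_functions(p, tri):
    """Linear shape functions (barycentric coordinates) of point p in tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    x, y = p
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    n1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    n2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return n1, n2, 1.0 - n1 - n2


def interpolate(p, tri, nodal_values):
    """Value at p of the linear field with the given values at the vertices."""
    weights = shape_functions(p, tri)
    return sum(n * v for n, v in zip(weights, nodal_values))


# Hypothetical nodal sizes on a unit right triangle.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sizes = [1.0, 2.0, 4.0]
at_vertex = interpolate((0.0, 0.0), triangle, sizes)
at_centroid = interpolate((1.0 / 3.0, 1.0 / 3.0), triangle, sizes)
```

At a vertex the interpolant returns the nodal value itself, and at the centroid it returns the average of the three nodal values.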

3.2  Evaluation  of  the  spacing  function 


To evaluate the spacing function, the averaged nodal error on the boundary nodes, $E_i$, is first
defined as:


(3)


with the same meaning of symbols as in eqs. (1) and (2).

On each boundary side having P nodes in the initial mesh, and defining $s_1$ and $s_P$ as the
one-dimensional coordinates of the initial and final nodes of the boundary side, nodal values of the spacing
function are evaluated as:


D(s,)  = 


2f.s>--vW 


(P-I) 


/-- 


uf  J 


(4) 


The spacing function d(s) is then defined over the boundary side as a piecewise linear function
assuming the values of eq. (4) at the nodes.

3.3  Boundary node definition


The first step to define the boundary nodes of the new mesh along a boundary side is to compute
the required number of nodes along that side, T, on the basis of the spacing function. This number is
computed for each boundary side r as:


(5)


The one-dimensional coordinate $s_i$ of the generic node i along the boundary side of the new
mesh is then defined by the relation:


•V,  = 


:('h+/  +  -V 


)4(.v,. 


(6) 


where d is the spacing function defined above and $m_i$ is the midnode one-dimensional coordinate of
node i, defined as:


1 

2 


(7) 


The values of $s_i$ along each side of the boundary are then computed by solving the set of nonlinear
equations defined by eq. (6), as proposed by Frey [17]. Once the boundary node definition has been
completed, an initial triangulation using these nodes only is performed and optimized using a standard
Delaunay algorithm.


3.4  Bulk  node  definition

To define the position of internal nodes, on every element of the intermediate mesh built up as
outlined in the previous subsection a test is performed to check if the circumcentre is internal to the
element. If not, the triangle is not considered a suitable candidate for a node placement since it is likely to
be disrupted by the Watson algorithm applied to the surrounding ones [17,18].
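
The circumcentre test can be sketched as follows. This is an illustration under the assumption of a non-degenerate planar triangle and is not the authors' code; the names `circumcentre` and `circumcentre_is_internal` are hypothetical. The circumcentre lies strictly inside a triangle exactly when the triangle is acute.

```python
def circumcentre(a, b, c):
    """Circumcentre of the triangle with vertices a, b, c given as (x, y) pairs."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))  # 4 x signed area
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy


def _cross(o, p, q):
    """z-component of (p - o) x (q - o)."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])


def circumcentre_is_internal(a, b, c):
    """True if the circumcentre lies strictly inside the triangle."""
    p = circumcentre(a, b, c)
    s1, s2, s3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    return (s1 > 0 and s2 > 0 and s3 > 0) or (s1 < 0 and s2 < 0 and s3 < 0)


if __name__ == "__main__":
    print(circumcentre_is_internal((0, 0), (2, 0), (1, 2)))  # acute triangle
    print(circumcentre_is_internal((0, 0), (4, 0), (1, 1)))  # obtuse triangle
```

A right triangle is rejected by this strict test because its circumcentre falls on the hypotenuse.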

If the triangle passes this test, a tentative new node is placed along the segment between incentre
and circumcentre, and a series of steps is taken to check if the tentative node should be maintained or
removed.
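
A minimal sketch of the tentative node placement is given below. The text does not state where along the incentre-circumcentre segment the node is placed, so the parameter `t` (0 gives the incentre, 1 the circumcentre) and its default value are purely illustrative assumptions, as are the function names.

```python
from math import hypot


def incentre(a, b, c):
    """Incentre of a triangle: vertices weighted by the lengths of the opposite sides."""
    la = hypot(b[0] - c[0], b[1] - c[1])  # side opposite vertex a
    lb = hypot(a[0] - c[0], a[1] - c[1])  # side opposite vertex b
    lc = hypot(a[0] - b[0], a[1] - b[1])  # side opposite vertex c
    perimeter = la + lb + lc
    return ((la * a[0] + lb * b[0] + lc * c[0]) / perimeter,
            (la * a[1] + lb * b[1] + lc * c[1]) / perimeter)


def tentative_node(inc, circ, t=0.5):
    """Point on the segment from the incentre `inc` to the circumcentre `circ`.

    t is a hypothetical placement parameter, not a value taken from the paper.
    """
    return (inc[0] + t * (circ[0] - inc[0]), inc[1] + t * (circ[1] - inc[1]))


if __name__ == "__main__":
    inc = incentre((0.0, 0.0), (4.0, 0.0), (0.0, 3.0))
    print(inc, tentative_node(inc, (2.0, 1.5)))
```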

The first step is the definition of a region of support of the new node, using the classic Watson
algorithm [18], and building $l$ "virtual triangles" centred on the new node.

For each triangle k, the sizing function $S$ defined in subsection 3.1 is computed at the nodes
and assumed to vary over the area of the triangle according to the element shape function. An
"optimal local size" based on the error information contained in the function $S(x,y)$ is then computed as:

nr  (8) 


For  each  of  the  above  virtual  triangles  the  ratio  X  between  the  optimal  and  the  real  local  size  is 
computed  as: 


r 


(9) 


If for every virtual triangle of the area of support X is smaller than one, the tentative node is deleted
and the original triangle labelled as permanent; if not, the node and the virtual triangles are inserted in the
mesh. The process is completed when all triangles are labelled as permanent.

The mesh originated by the above procedure is then subjected to a final "smoothing", using a
rubber banding technique, to improve the aspect ratio of triangles if required.
4.  TEST  CASES

Two test cases are presented:

•  an L-shaped problem, often used for error estimate validation since it presents a singular point and it has
a known solution;

•  a geometry which has singular points, null field zones and homogeneous field zones distributed on
it.

For  each  test  case  the  initial  mesh  and  the  final  mesh  are  shown. 

In both cases the error requested on the solution was 1% and the value was reached with a
single remeshing loop.


Fig  1:  L-shaped  problem  initial  mesh  (23  nodes,  30  elements) 


Fig  2:  L-shaped  problem  final  mesh  (143  nodes,  240  elements)


Fig.  3:  Test  case  2  initial  mesh  (435  nodes,  778  elements)


Fig.  4:  Test  case  2  final  mesh  (344  nodes,  561  elements)


5.  CONCLUSIONS 

The remeshing algorithm presented in this paper has proven, in the test cases performed so far,
rather reliable and efficient, providing good quality results even with rather coarse initial meshes. With
accuracy requirements adequate for many initial design purposes, the procedure has provided in most
cases a mesh matching the requirements in a single iteration. Even if a remeshing iteration is somewhat
computationally heavier than a usual h- or p-type adaption one, this seems to indicate a high likelihood of
computational savings for the remeshing approach, since more "classical" adaption procedures usually
require three to five iterations or more in similar cases.

If coupled with reliable and efficient automatic meshing routines and with accurate and robust error
estimation algorithms, the remeshing procedure described here appears promising for setting up an efficient
"accuracy driven problem solution", requiring of the user only the setting of the desired accuracy levels and
no other intervention to define the mesh. However, further tests are necessary on more realistic design
structures and in a larger variety of cases to test the ability of the procedure to reach this ambitious goal.

Further activity is also required to extend the coverage beyond the electrostatic and magnetostatic
cases tested so far, particularly to problems involving time discretization, for whose solution, if reliable
error estimators are available, the features of the proposed algorithm appear particularly promising.
Abstract

This paper presents and applies a new formulation for finite element computation of propagation
constants and mode shapes in a wide variety of waveguiding structures. The formulation is
novel in that its solution variables are the two components of the magnetic vector potential in the
cross-sectional waveguide plane and the electric scalar potential. These variables form the
eigenvectors of an eigenvalue problem in which the eigenvalues are related to propagation
constants at the frequency of interest. The new finite elements are applied to inhomogeneous
waveguide problems, including isotropic and anisotropic microstrip lines, and the computed high
frequency propagation constants are shown to agree closely with those of previous papers. The
new formulation also obtains correct low frequency propagation constants and mode shapes.

INTRODUCTION 

Computation of waveguide dispersion, i.e., propagation constant versus frequency, has been
carried out using finite elements for several years. A good summary of the various existing
techniques has been recently presented in [1], where a total of six different formulations are examined
and compared. Included are formulations by Lee et al. [2], by Hano [3], and by Koshiba et al. [4].
All six examined formulations use components of either the electric field or the magnetic field
as their primary solution variables, as do more recent formulations [5], [6].

This paper presents a new formulation that for the first time uses potentials as the primary
solution variables, not the fields themselves. The potentials used are the two components of the
magnetic vector potential A and the electric scalar potential φ. The vector potential is used as an edge
or tangential variable in edge-based finite elements, while the scalar potential is nodal-based.
Use of the edge-based vector potential eliminates spurious modes in the desired solution
spectrum and allows accurate analysis of waveguides with sharp interior conducting corners.

The first part of this paper uses Galerkin techniques to derive the eigenvalue equation in terms
of the vector and scalar potentials. Next, the inhomogeneous rectangular waveguide of Hano [3]
is analyzed by the new method. Finally, the two microstrip lines of Koshiba et al. [4] are analyzed.
The isotropic microstrip line is analyzed over an extremely broad frequency range, and the
anisotropic microstrip line is also analyzed. High frequency propagation constants computed here are
compared with those of others. Also, low frequency propagation constants are computed for the
isotropic microstrip line.
THEORY

The new formulation is implemented in MSC/EMAS™, an electromagnetic analysis software
package with 3D, 2D, 1D, and 0D finite elements [7], [8]. Its solution variables in 3D are the three
components of magnetic vector potential A and the time-integrated electric scalar potential. To
compute propagation constants for 2D waveguide cross-sections, we choose the transverse
components of the magnetic vector potential A and the electric scalar potential φ (volts). Electric field
is then:

$$\mathbf{E} = -j\omega\mathbf{A}_t - \nabla\phi \qquad (1)$$


We transform the potentials as follows, where $c_0$ is the speed of light in vacuum:

$$\mathbf{A}'_t = j\mathbf{A}_t, \qquad \phi' = \frac{\phi}{c_0} \qquad (2)$$


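Substituting (2) in (1), and denoting transverse components by t and the complex propagation
constant in the longitudinal z direction by $\gamma = \alpha + j\beta$:

$$\mathbf{E} = -\omega\mathbf{A}'_t - c_0\nabla_t\phi' + \gamma c_0\phi'\,\hat{\mathbf{z}} \qquad (3)$$

$$\mathbf{E}_t = -\omega\mathbf{A}'_t - c_0\nabla_t\phi', \qquad E_z = \gamma c_0\phi' \qquad (4)$$

The magnetic flux density is defined by Faraday's Law:

$$\mathbf{B} = -j\left[(\nabla_t - \gamma\hat{\mathbf{z}})\times\mathbf{A}'_t\right] = -j\left(\nabla_t\times\mathbf{A}'_t - \gamma\hat{\mathbf{z}}\times\mathbf{A}'_t\right) \qquad (5)$$

$$\mathbf{B}_z = -j\left(\nabla_t\times\mathbf{A}'_t\right) \qquad (6)$$

$$\mathbf{B}_t = j\gamma\left(\hat{\mathbf{z}}\times\mathbf{A}'_t\right) \qquad (7)$$

$$\mu_0 = \frac{1}{\varepsilon_0 c_0^2}, \qquad k_0 = \frac{\omega}{c_0}$$

Using the above relations, Ampere's Law gives (where reluctivity is $\nu$):

$$-j\nabla\times[\nu_r](\nabla\times\mathbf{A}'_t) + \omega\mu_0[\sigma]\mathbf{A}'_t + \mu_0 c_0[\sigma](\nabla\phi') + jk_0^2[\varepsilon_r]\mathbf{A}'_t + jk_0[\varepsilon_r](\nabla\phi') = 0 \qquad (8)$$

The continuity equation gives:

$$\left[ j\omega c_0\varepsilon_0\nabla_t\cdot[\varepsilon_r]\nabla_t\phi' + j\omega^2\varepsilon_0\nabla_t\cdot[\varepsilon_r]\mathbf{A}'_t + c_0\nabla_t\cdot[\sigma]\nabla_t\phi' + \omega\nabla_t\cdot[\sigma]\mathbf{A}'_t \right] + \gamma^2\left( j\omega c_0\varepsilon_0\varepsilon_{zz}\phi' + c_0\sigma_{zz}\phi' \right) = 0 \qquad (9)$$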


Galerkin's method is now used. Testing with $j\,\delta\mathbf{A}'_t$ and with $\delta\phi'$ can be shown to yield
an equation in terms of the square of the propagation constant:


[K] 


(10) 


where [K] and [M] are symmetric indefinite matrices. The eigenvalues of (10) are the propagation
constants and the eigenvectors define the modal fields. The matrices of (10) have been derived
for first order and second order quadrilateral and triangular finite elements in MSC/EMAS for
the case of propagating modes, where $\gamma = j\beta$. In this case the eigenvalues of (10) are real.

APPLICATION  TO  INHOMOGENEOUS  RECTANGULAR  WAVEGUIDE 

Here we present the results of the above formulation for the inhomogeneous rectangular
waveguide of Hano [3]. It is modeled using quadrilateral finite elements and analyzed by MSC/EMAS.

Fig. 1 shows the model made up of 40 first order quadrilaterals. Hano's dimension h is here
set to 0.1 meter. Note that half of the guide is filled with air. The other half is filled with a
dielectric material ε that has permittivity 4 times that of air. Figure 1 shows the computed E and H fields for
the first two modes at 1.91 GHz, which corresponds to Hano's $k_0 h = 4$.
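
The correspondence between the normalized quantity $k_0 h$ and the quoted frequency can be checked with a short calculation, assuming the free-space relation $k_0 = 2\pi f/c_0$. The function name is illustrative.

```python
from math import pi

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def frequency_from_k0h(k0h, h):
    """Frequency in Hz for a given normalized wavenumber k0*h and dimension h in metres."""
    k0 = k0h / h                 # free-space wavenumber, rad/m
    return k0 * C0 / (2.0 * pi)  # f = k0 * c0 / (2*pi)


if __name__ == "__main__":
    for k0h in (2.5, 4.0, 5.0):
        print(k0h, frequency_from_k0h(k0h, 0.1) / 1e9, "GHz")
```

With h = 0.1 m, $k_0 h = 4$ gives about 1.91 GHz, in agreement with the value quoted above.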


Fig.  1.  Finite  element  model  of  inhomogeneous  rectangular  waveguide  and 
E  (upper),  H  (lower)  fields  computed  at  1.91  GHz.  a),  model,  b).  mode  2. 

Table 1 lists the propagation constants computed at three different frequencies. Note that at
$k_0 h = 2.5$, only the fundamental mode exists. At $k_0 h = 4$, three modes propagate, and six modes
propagate at $k_0 h = 5$. The propagation constants of Table 1 agree very closely with those of
Hano's graph [3]. Unlike in Hano's paper, no spurious modes were observed using the formulation
of this paper.

Table 1. MSC/EMAS computations of inhomogeneous rectangular waveguide

k0 h    Propagation constants (rad/m)
2.5     27.365
4       63.702, 61.196, 47.381
5       85.499, 85.200, 74.142, 36.959, 33.045, 29.665
APPLICATION  TO  MICROSTRIP  LINE

Here we present the results of the above formulation for the inhomogeneous microstrip
transmission line of Koshiba [4]. It is modeled using quadrilateral finite elements and analyzed by
MSC/EMAS.

Fig. 2 shows the microstrip model made up of 80 first order quadrilaterals. As specified by
Koshiba, the width w of the strip is 1.27 mm, and the dielectric height h has the same dimension. Note
that the model is very coarse in that the strip is only two elements wide.

Fig. 2 also shows the E and H fields computed for the first two modes at 30 GHz. These are
for Koshiba's first case where the substrate has an isotropic relative permittivity of 8.875. Table
2 lists the computed propagation constants at this frequency and at lower frequencies. Note that
at the low frequencies, only the fundamental quasi-TEM mode exists. Table 2 also lists the
propagation constants computed here for Koshiba's second case, which is anisotropic. The
permittivity tensor is assumed to be diagonal, with a relative permittivity of 9.4 in the x and z directions,
and 11.6 in the vertical y direction.




Fig.  2.  Finite  element  model  of  microstrip  and  computed 
E  (upper),  H  (lower)  fields  for  isotropic  dielectric  at  30  GHz.  a),  model,  b).  mode  2. 
Table 2. MSC/EMAS computations of microstrip line, where + = additional mode(s)

Frequency (Hz)    Isotropic β (rad/m)    Anisotropic β (rad/m)
1 k               5.18E-5                not analyzed
10 k              5.11E-4                not analyzed
1 M               5.11E-2                not analyzed
5 G               263                    294
10 G              545                    615
15 G              840, 238, 224          954, 244, +
20 G              1144, 580, +           1304, 724, +
25 G              1454, 1014, +          1661, 1204, +
30 G              1767, 1315, +          2021, 1652, +

The  propagation  constants  of  Table  2  can  be  compared  to  those  computed  by  Koshiba.  Both 
the  results  of  his  original  method  [4]  and  his  recently  improved  method  [9]  will  be  compared  to 
the  Table  2  propagation  constants  for  the  first  two  modes. 

Fig. 3 shows the results for the first isotropic mode over frequencies up to 30 GHz. Note that
the results here appear to be identical to those of Koshiba. However, Fig. 4 is a detail near the
origin. Note that the MSC/EMAS results are a straight line, whereas Koshiba's results are not.
Because the first mode is TEM, the slope should be constant at these frequencies. Hence the
MSC/EMAS formulation is more accurate than Koshiba's at low frequencies.
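
The constant-slope argument can be illustrated numerically from the isotropic entries of Table 2. The sketch below assumes that the tabulated values are phase constants β in rad/m and uses the quasi-TEM relation $\beta = 2\pi f\sqrt{\varepsilon_{eff}}/c_0$; the effective permittivity is a derived quantity that the paper does not quote.

```python
from math import pi

C0 = 299_792_458.0  # speed of light in vacuum, m/s

# Isotropic first-mode values copied from Table 2: frequency (Hz) -> beta (rad/m).
TABLE2_ISOTROPIC = {1e3: 5.18e-5, 1e4: 5.11e-4, 1e6: 5.11e-2, 5e9: 263.0, 10e9: 545.0}


def effective_permittivity(f, beta):
    """Effective relative permittivity implied by beta = 2*pi*f*sqrt(eps_eff)/c0."""
    return (beta * C0 / (2.0 * pi * f)) ** 2


if __name__ == "__main__":
    for f, beta in sorted(TABLE2_ISOTROPIC.items()):
        print(f"{f:10.3g} Hz  beta/f = {beta / f:.3e}  eps_eff = {effective_permittivity(f, beta):.3f}")
```

A constant ratio β/f corresponds to a frequency-independent effective permittivity, which is what the straight line near the origin expresses. The implied value lies below the substrate permittivity of 8.875, as expected for a field shared between dielectric and air.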


Fig.  3.  Isotropic  microstrip,  mode  1


Fig.  7.  Anisotropic  microstrip,  mode  2 


Fig. 5 compares propagation constants computed for the second mode in the isotropic case.
Note that the MSC/EMAS results disagree somewhat with Koshiba's results for this mode.

Figs. 6 and 7 compare propagation constants computed for the two modes in the anisotropic
case. Note again that the MSC/EMAS results agree closely with Koshiba's results for the first
mode, but disagree somewhat with Koshiba's results for the second mode.

CONCLUSIONS 

New finite elements have been developed for computation of propagation constants and modal
fields in waveguiding structures. The use of edge-based magnetic vector potential and
nodal-based electric scalar potential appears to be advantageous in that low frequency results for
microstrip are more accurate than those reported by others. The high frequency propagation constants
are also accurately computed, as demonstrated by the microstrip with isotropic or anisotropic
materials, and by a rectangular inhomogeneous waveguide.
A Scattering Analysis of Laser Beam Wave by Groove Pits
on Optical Memory Disk by Using FEM with BEM

Yasumitsu MIYAZAKI and Keiji TANAKA

Department of Information and Computer Sciences, Toyohashi University of Technology,
1-1, Hibarigaoka, Tempaku-cho, Toyohashi-shi, Aichi, 441 Japan

Abstract - Numerical solutions of electromagnetic fields are presented for scattering
by guide-grooves and recording marks on re-writable type phase-change optical disks with
multi-layered structure. The finite element method with boundary element method can be
applied to scattering analysis of various groove forms and inhomogeneous recording marks
under different conditions of the incident beam. Near and scattered far field characteristics
from guide-grooves having trapezoidal shapes in cross section are studied for different
polarizations of the incident beam wave. Characteristics of read-out signals are also calculated as
functions of groove height and film-thickness of the multi-layer. As a result of numerical
analysis, the groove height to obtain maximum tracking sensitivity and the optimum film-thickness
of the multi-layer are found.


1  Introduction 

Recently, optical memory disks able to store a large amount of
information have been used as video disks, digital audio
disks, CD-ROM and so on. Memory signals are
recorded on the disk substrate in the form of pits or recording
marks, which have dimensions of the order of the optical
wavelength $\lambda_0$. Exact evaluation of the scattering
characteristics is important to optimize the shape of pits,
guide-grooves and recording marks, in order to increase
the memory density and capacity.

The optical disks are classified into several categories:
read-only type, write-once type and re-writable type.
Scattering characteristics of read-only
disk pits and guide-grooves have been studied using the
diffraction theory in a scalar field[1], the diffraction theory
in a vector field[2], and precise electromagnetic field
analysis[3]. According to diffraction theory in a scalar
field, where the size of a scattering body is of the same
order or less than the wavelength $\lambda_0$, an exact solution
cannot be obtained. In the diffraction theory of vector
field or the precise field analysis, on the other hand, the
analysis is more complicated and imposes restrictions
on the shapes of pits and guide-grooves and the incident
beam. We used the boundary element method (BEM)
to apply several conditions and obtained satisfactory
results[4]-[6].

In the scattering analysis of laser beam wave from
pits and grooves on the optical disks of write-once type
and re-writable type, it is necessary to analyze the
optical scattering characteristics of pits and grooves on
the boundaries of a multi-layered dielectric medium. For the
re-writable type of phase-change optical disk, memory
signals are recorded on the recording layer. The recording
layer changes from the crystallized form before recording
into the non-crystallized one after recording, and this yields
an inhomogeneous region of refractive index, called the
"Recording mark". The scattering field analysis by the
finite element method (FEM) with BEM yields exact
characteristics.

In the case of the re-writable type optical disk, the
FEM can be applied to the analysis of the inhomogeneous
region of the multi-layer, and the BEM can be applied
to the homogeneous region of the poly-carbonate
(PC) substrate, which is an open region. The fields
represented by BEM and FEM are matched on the
boundary between the PC substrate and multi-layer.
The fields on the boundary between multi-layer and
reflection layer satisfy the surface impedance boundary
condition. In this paper, scattering characteristics by
re-writable type groove pits are discussed for TE-wave
and TM-wave. We present the
formulation using the FEM with BEM in section 2 and
show some results in section 3.

2  Formulation 

Figure 1 shows a sectional view across the radius of a
phase-change optical disk. There are recording marks
in the recording layer over the guide-groove. Memory
signals are detected by photo detectors as a change of
scattered field intensities. Actual phase-change optical
disks have a three-dimensional structure. However, for
simplification, the scattering analysis model is regarded as a
two-dimensional structure. Figure 2 shows the scattering
analysis model for the phase-change optical disk with
coordinates for the trapezoidal guide-grooves and the
incident beam. Let the coordinate system for the
guide-grooves be $(x, y, z)$ with an origin $O$ and that for the
incident beam be $(x, y', z')$ with an origin $O'$. The
phase-change optical disk consists of a reflection layer
region $S_3$, a multi-layer region $S_2$ and a substrate
region $S_1$ formed of poly-carbonate. The multi-layer region
$S_2$ consists of protective and recording layers. There
is a boundary $\Gamma_1$ between the regions $S_1$ and $S_2$, and
a boundary $\Gamma_2$ between the regions $S_2$ and $S_3$. The
boundaries $\Gamma_1$ and $\Gamma_2$ have the shape of a trapezoidal
guide-groove. The trapezoidal guide-groove
on the reflection layer has an upper width $2w_u$, lower
width $2w_l$, recording mark width $2w_r$, height $h$ and
track pitch $b$. The shape of the guide-groove is similar on the
boundaries $\Gamma_1$ and $\Gamma_2$. A recording layer of GeSbTeSe
film[7] is formed between dielectric layers made of
ZnS-SiO2, and an Au reflective layer is set below them.
The values of refractive index and film thickness of
each layer used for the analysis are shown in Table 1.
The refractive index in the recording mark changes from $n_{22}$ into $n_r$.

Table 1: Analysis parameters of phase-change optical disk ($\lambda_0 = 0.83\,\mu$m)

Region                          Structure of Film   Refractive Index                       Film-thickness
S1  Substrate                   PC                  n_1 = 1.58                             Infinity
S2  Upper protective layer      ZnS-SiO2            n_21 = 2.1                             d_1 = 140 nm
S2  Memory layer                GeSbTeSe            n_22 = 5T - j2.4, n_r = 4.5 - jT.O     d_2 = 40 nm
S2  Lower protective layer      ZnS-SiO2            n_23 = 2.1                             d_3 = 200 nm
S3  Reflection layer            Au                  n_3 = 0.2 - j5.0                       Infinity

Figure 1: A radial section of phase-change optical disk

In our formulation, the region $S_1$, which is a homogeneous
and open region, is treated by the BEM. The
inhomogeneous region $S_2$ is treated by the FEM. The
region $S_3$, which is a good conductor, is treated by the surface
impedance approximation method.

The incident beam $\phi^{in}$ is a two-dimensional Gaussian
beam of TE-wave (having only the $x$ component
of electric field $\mathbf{E}$) or TM-wave (having only the $x$
component of magnetic field $\mathbf{H}$). The beam waist $w_0$ is
located on the $y'$ axis; $y_0$ represents the tracking error; and
the angle of incidence is also a parameter of the beam. The focus point
$(y_0, z_0)$ of the incident beam is set at the point $O'$ on the
recording layer.

Figure 2: Analysis model for phase-change optical disk.

The incident beam for TE-wave and TM-wave
can be represented by


Eu 

Ih 


uiy\E), 


where 


(1) 


(2a) 


(2b) 

(2c) 


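$w_0$ is the spot size at the beam waist and $k_1 = n_1 k_0$
is the propagation constant in the region $S_1$. The time
factor $\exp(j\omega t)$ is omitted from our formulations.

The field $\phi_1$ in the region $S_1$ and the field $\phi_2$ in the
region $S_2$ satisfy the following Helmholtz equations,
respectively:

$$\nabla^2\phi_1 + k_1^2\phi_1 = -g_1 \quad \text{in } S_1, \qquad (3)$$

$$\nabla\cdot\left(\frac{1}{p_2}\nabla\phi_2\right) + k_0^2 q_2\phi_2 = 0 \quad \text{in } S_2, \qquad (4)$$

where $g_1$ is a wave source in the region $S_1$. $p_n$ and $q_n$ are
given by

$$p_n = \begin{cases} 1 \\ \varepsilon_n/\varepsilon_0 \end{cases}, \qquad q_n = \begin{cases} \varepsilon_n/\varepsilon_0 & \text{(TE-wave)} \\ 1 & \text{(TM-wave)} \end{cases} \qquad (5)$$

where $\varepsilon_n/\varepsilon_0$ ($n = 1, 2, 3$) is the ratio of permittivity in each
region and $k_0 = \omega/c$ is the free space propagation
constant, with $c$ the light velocity.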

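The boundary conditions on $\Gamma_1$ are given by

$$\phi_1 = \phi_2, \qquad (6a)$$

$$\frac{1}{p_1}\frac{\partial\phi_1}{\partial n} = \frac{1}{p_2}\frac{\partial\phi_2}{\partial n}, \qquad (6b)$$

where $\partial/\partial n$ represents the derivative along the inward
normal direction to the region $S_2$.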

The boundary condition on $\Gamma_2$ for TE-wave and
TM-wave can be represented by the surface impedance
boundary condition:


P2  on  \  JulSoZrn  ) 
where 


Figure  3:  Integral  of  each  element 

2.1  Boundary element analysis in the region $S_1$

For the boundary element analysis, the weight function
is the Green function in the two-dimensional free space
as follows:

$$\phi^* = -\frac{j}{4}H_0^{(2)}(k_1 r), \qquad (9)$$

where $H_0^{(2)}$ is the zero-th order Hankel function of the
second kind. When the boundary element method with
weighted residual procedure is applied to Eq.(3), the
following equation is obtained:

/  (10) 

Jr,  on  on 

When point $i$ is on the boundary $\Gamma_1$, the following
boundary integral equation is obtained:

Q4>u  +  -f  ^d>idF-/  (11) 

Jr,  Jr,  On 

where $C_i$ is determined by the angle $\theta_i$, which is defined
in figure 3, as $C_i = \theta_i/2\pi$, and ⨍ represents
the Cauchy principal value integration. By dividing
Eq.(11) into boundary elements and expanding
the field $\phi_1$ and $\partial\phi_1/\partial n$ using the interpolation
functions $f_j$ ($j = 1, 2, \ldots, N_{nb1}$), the discrete equation with
respect to point $i$ on the boundary $\Gamma_1$ is obtained. $\phi_1$
and $\partial\phi_1/\partial n$ at all the nodal points are given by the
matrix representation,

=  i«').  (12) 

where matrices $[H]$ and $[G]$ represent square matrices
of $N_{nb1} \times N_{nb1}$. $\{\phi_1\}$, $\{\partial\phi_1/\partial n\}$ and { } represent
vectors of $N_{nb1} \times 1$.


is the surface impedance of the region $S_3$, because $S_3$
is a good conductor.


2.2  Finite element method in the region $S_2$

For the finite element analysis, the region $S_2$ is
divided into $N_{ef}$ triangular elements. We apply the
Galerkin method to obtain the equation for an
element '(e)', by multiplying Eq.(4) by the weight
function $f_j$ ($j = 1, 2, \ldots, N_{nf}$), and integrating over the area
$S_2(e)$ of a triangular element '(e)' as shown in figure 3.
The following equation is obtained by summation over all
elements:

-T  I  = 


0.  (13) 


where $\oint_{C(e)}$ represents an integration along the
contour $C(e)$ and $\mathbf{n}(e)$ represents a unit vector of the inward
normal direction to the region $S_2(e)$ as shown in figure
3. Eq.(13) reduces to the following equation due to the
boundary and continuity conditions between neighboring
elements:


[  -V/;  -V02)d5 

JSj  P2 

1  d4>2 


i 


f^-^dr  =  Q. 

r.+Pj  P2 


(14) 


The field $\phi_2$ in the region $S_2$ is represented by expanding
using the interpolation function, which is similar
to the weight function $f_j$. The derivative $\partial\phi_2/\partial n$ along
the normal direction on the boundaries $\Gamma_1$ and $\Gamma_2$ is also
represented similarly. Therefore $\phi_2$ and $\partial\phi_2/\partial n$ at the
nodal points in the region $S_2$ are given by the matrix
representation,


where $[K]$ represents a square matrix of $N_{nf} \times N_{nf}$, and
$[M]$ represents a matrix of $N_{nf} \times (N_{nb1} + N_{nb2})$. Also
$\{\phi_2\}$ and $\{\partial\phi_2/\partial n\}$ represent vectors of $N_{nf} \times 1$ and
$(N_{nb1} + N_{nb2}) \times 1$, respectively.


2.3  Combination  of  FEM  and  BEM 

By substituting the boundary conditions Eq.(6a) and
Eq.(6b) at the boundary $\Gamma_1$ and the boundary
condition Eq.(7) at the boundary $\Gamma_2$ into Eq.(15), the fields
represented by BEM and FEM are matched on the
boundary $\Gamma_1$. $\phi_2$ and $\partial\phi_2/\partial n$ at each nodal point are
obtained by solving the matrix equations of the FEM and
BEM. The scattered field at an arbitrary point $i$
in the region $S_1$ is given by


where $H_{ij}$ is the element in the matrix $[H]$ when the
value $C_i$ for its diagonal element is put to zero. The
scattered field intensity $P_s(\theta)$ is defined by

$$P_s(\theta) = 20\log_{10}\left(\frac{|\phi^s(\rho,\theta)|}{\max|\phi_0^s(\rho,\theta)|}\right), \qquad (17)$$

where $\phi_0^s(\rho,\theta)$ represents the scattered far field from
the perfectly conducting plane in the PC substrate
($n_1 = 1.58$).
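
The normalization of Eq.(17) can be sketched as follows, assuming the scattered field and the reference field of the perfectly conducting plane are available as sampled complex values over the observation angles. The sample values in the usage lines are made up for illustration, and the function name is an assumption.

```python
from math import log10


def scattered_intensity_db(field, reference):
    """Scattered field intensity in dB, normalized by the peak of the reference field.

    field, reference: sequences of complex far-field samples over the same angles.
    A zero sample in `field` has no finite dB value and raises ValueError.
    """
    ref_max = max(abs(v) for v in reference)
    return [20.0 * log10(abs(v) / ref_max) for v in field]


if __name__ == "__main__":
    field = [1.0 + 0.0j, 0.5j]
    reference = [2.0 + 0.0j, 1.0 + 0.0j]
    print(scattered_intensity_db(field, reference))
```

On this scale a halving of the field magnitude corresponds to a drop of about 6 dB.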

If both the incident beam and the groove shape in
figure 2 are symmetrical about the $z$-axis, the analysis region
can be reduced to half of each region. Although
the region $S_2$ is an infinite one, it is truncated at a
point where the equivalent surface current is sufficiently
attenuated. We have confirmed that numerical solutions
obtained by the present formulation agree well with
another method[8].


3  Numerical  Results 

The constant parameters used in the calculations are:
incident wavelength $\lambda_0 = 0.83\,\mu$m, NA (Numerical
Aperture of objective lens) $= 0.5$, $w_0 = 0.41\lambda_0/\mathrm{NA}$,
$\rho = 1000\lambda_0/n_1$, $w_u = w_l = \ldots$,
$b = 1.6\,\mu$m. The refractive index and film-thickness of
each layer are shown in Table 1. These parameters are used
in the calculations for the following numerical examples
unless otherwise stated. There
are three guide-grooves in the analysis region. The
recording layer is homogeneous before recording, but
becomes inhomogeneous after recording. This
is because the recording signal is marked on the recording
layer over the center guide-groove after recording by
the writing laser beam. We call the marked part the
'Recording Mark'.

Figure 4 shows the allocation of triangular elements in
region $S_2$ with symmetry about the $z$-axis. The region $S_2$
is divided into $N_{ef}$ structured triangular elements. The
boundaries $\Gamma_1$ and $\Gamma_2$ are divided into $N_{eb1}$ and $N_{eb2}$
linear elements, respectively. The length of a linear element
and the edge of a triangular element are less than 1/10
of the wavelength $\lambda_n$ ($n = 1, 2, 3$) in each region. As a
result of dividing the elements in each region, the
finite element mesh consists of 2520 ($= N_{ef}$) triangular
elements with 1377 ($= N_{nf}$) nodes. There are 81 ($= N_{nb1}$)
nodes on the boundary $\Gamma_1$ and 97 ($= N_{nb2}$) nodes on
the boundary $\Gamma_2$. Taking advantage of the banded
and sparse structure of the $[K]$ matrix, the memory size of the
computer can be small.
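
The element-size rule can be illustrated with a short calculation, assuming the wavelength in a lossless region of real refractive index $n$ is $\lambda_n = \lambda_0/n$. Only the real indices of Table 1 are used here; how the rule is applied to the lossy layers is not stated in the text, and the function name is illustrative.

```python
def max_edge_length(lambda0, n, fraction=10.0):
    """Upper bound on element edge length: (lambda0 / n) / fraction.

    lambda0: free-space wavelength; n: real refractive index of the region.
    The result has the same length unit as lambda0.
    """
    return lambda0 / n / fraction


if __name__ == "__main__":
    lambda0 = 0.83  # micrometres
    for name, n in (("PC substrate", 1.58), ("ZnS-SiO2 layer", 2.1)):
        print(name, round(1000.0 * max_edge_length(lambda0, n), 1), "nm")
```

For the PC substrate this gives an edge-length limit of roughly 53 nm, and about 40 nm in the protective layers.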
Figure  4:  Allocation  of  triangular  elements

3.1  Characteristics of Scattered Far Fields

Figure 5 shows the changes of the scattered far field
patterns before and after recording for TE-wave and
TM-wave, where BEF-REC and AFT-REC in the
figure represent the state before recording and after
recording, respectively. The scattered far field patterns
change widely in the range $\theta = 60° \sim 90°$. The main
lobes fall to a low value after recording for both
polarizations. The scattered far field at $\theta = 90°$ after recording
is lower by about 6 dB than that before recording. The
side lobe at about 70° appears after recording. The
dip between main and side lobes for TM-wave is larger
than that for TE-wave. So the difference before and
after recording for TM-wave is larger than that for
TE-wave.

Figure 6 shows the changes of the scattered far field
patterns as a function of the tracking error y0 for the
TE-wave and TM-wave before recording. As the tracking error
y0 becomes large, the scattered far field in the range
θ = 0° to 90° becomes large in comparison with θ = 90° to
180°. When y0 is 0.4 µm, there is a deep dip at about 100°,
and the difference of the scattered field between the right
and left half planes is maximum. When y0 is greater than
0.4 µm, the scattered far field intensity at θ = 90° becomes
much larger and the scattered far field patterns become
symmetric.

3.2 Characteristics of Near Fields

The scattered far field and the field in the multi-layer
are related to each other. Therefore, we have analyzed the
fields near the center guide-groove. The fields are composed
of scattered fields and incident fields. First, Figures 7
and 8 show the distributions of electric fields and magnetic
fields for the TE-wave and TM-wave, where panels (a) and (b)
show the state before recording and after recording,
respectively. For both polarizations, there are standing
waves over the boundary Γ1, and the standing wave ratio
before recording is larger than that after recording. In the
multi-layer region S2, the electric fields in the lower
protective layer after recording for the TE-wave are greater
than those before recording. The magnetic fields for the
TM-wave behave similarly, except that the magnetic fields
are large on the conductive medium. These phenomena are
attributed to resonance in the lower protective layer
between the recording and reflection layers. The scattered
field differs before and after recording because the film
thickness for resonance is changed by the refractive index
of the recording layer. This difference affects the
scattered far fields. The main lobes of the scattered far
field patterns largely depend on the film thickness of the
lower protective layer, and the read-out signal I_s is
maximum when the film thickness d3 of the lower protective
layer takes a suitable value.

Figure 5: Scattered field patterns by the phase-change
optical disk, before recording (BEF-REC) and after recording
(AFT-REC), for h = λ0/8n1, b = 1.6 µm, y0 = 0,
p = 1000λ0/n1.

Figure 6: Scattered field patterns Ps(θ) [dB] with tracking
errors (TE-wave), for h = λ0/8n1, b = 1.6 µm,
p = 1000λ0/n1.

Near field intensity and the corresponding equivalent
induced currents are important for the evaluation of
scattering by inhomogeneities and index changes. Figure 9
shows the equivalent induced currents corresponding to the
field intensities at the boundary discontinuities ($\phi_i$
and $J_i = \partial\phi_i/\partial n$).

3.3  Characteristics  of  Read-Out  Signals 

The waves scattered by the recording mark and the
guide-grooves on the optical disk are detected by a
two-split photodetector through an objective lens. Here we
define the read-out signal I_s and the tracking error signal
I_d: I_s is the sum of the output signals of the two
detectors, and I_d is the difference of their output
signals. The read-out signal and tracking error signal are
given by

$$ I_s = \frac{1}{I_0} \int_{90^\circ-\alpha}^{90^\circ+\alpha} |\phi(\theta)|^2 \, d\theta \tag{18} $$

$$ I_d = \frac{1}{I_0} \left\{ \int_{90^\circ-\alpha}^{90^\circ} |\phi(\theta)|^2 \, d\theta - \int_{90^\circ}^{90^\circ+\alpha} |\phi(\theta)|^2 \, d\theta \right\} \tag{19} $$

respectively, where $\phi(\theta)$ is the distribution of
scattered far fields, α = sin^-1(NA/n1) is the angle of
aperture, and I_0 represents the read-out signal from a
perfectly conducting plane in the PC substrate (n1 = 1.58).
We also define the read-out signal before recording,
I_s(BEF), the read-out signal after recording, I_s(AFT), and
the amplitude of the read-out signal
ΔI_s = I_s(BEF) - I_s(AFT).
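As a rough illustration of the sum and difference signals defined above, the following sketch integrates a sampled far-field intensity over the two halves of the objective aperture. It assumes that each detector collects the intensity on one side of θ = 90° and that the result is normalised by a reference value `i0`; the function name `detector_signals`, the Gaussian test lobes and the trapezoid rule are illustrative choices, not the authors' implementation.

```python
import math


def detector_signals(pattern, alpha_deg, i0=1.0, samples=2000):
    """Return (sum signal, difference signal) of a two-split detector.

    pattern   : callable giving the far-field intensity at angle theta (degrees)
    alpha_deg : half-angle of the aperture, centred on theta = 90 degrees
    i0        : reference intensity used for normalisation (assumed given)
    """

    def integrate(lo, hi):
        # Composite trapezoid rule over [lo, hi], converted to radians.
        step = (hi - lo) / samples
        total = 0.5 * (pattern(lo) + pattern(hi))
        for k in range(1, samples):
            total += pattern(lo + k * step)
        return total * math.radians(step)

    left = integrate(90.0 - alpha_deg, 90.0)
    right = integrate(90.0, 90.0 + alpha_deg)
    return (left + right) / i0, (left - right) / i0


# Aperture half-angle for NA = 0.5 inside a substrate with n1 = 1.58.
ALPHA = math.degrees(math.asin(0.5 / 1.58))

# Hypothetical lobes: one centred on the axis, one shifted by a tracking error.
centred = lambda t: math.exp(-((t - 90.0) / 10.0) ** 2)
shifted = lambda t: math.exp(-((t - 85.0) / 10.0) ** 2)

sum_centred, diff_centred = detector_signals(centred, ALPHA)
sum_shifted, diff_shifted = detector_signals(shifted, ALPHA)
print(f"alpha = {ALPHA:.2f} deg")
print(f"centred lobe: I_s = {sum_centred:.4f}, I_d = {diff_centred:.2e}")
print(f"shifted lobe: I_s = {sum_shifted:.4f}, I_d = {diff_shifted:.4f}")
```

A pattern that is symmetric about θ = 90° gives a vanishing difference signal, while a shifted lobe produces a non-zero one, which is the behaviour a tracking error signal is meant to capture.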

Figure 10 shows the read-out signal I_s and the maximum
tracking error signal I_dmax versus the guide-groove height
h before recording. The read-out signal I_s is the value for
y0 = 0, and the maximum tracking error signal I_dmax is the
value for y0 = 0.4 µm. For h = 0, I_s is maximum and I_dmax
is about zero. For the TE-wave, I_dmax is maximum at a
height of h = 70 nm. For the TM-wave, I_dmax is maximum at a
height of h = 60 nm. These values of h are about λ0/8n1.
Thus this value of h is an optimum height to obtain maximum
tracking sensitivity for each polarization. When h is lower
than λ0/8n1, the read-out signal I_s retains half the
intensity of the read-out signal at h = 0.

Figure 7: Distributions of electric fields near a center
guide-groove (TE-wave), for h = λ0/8n1, b = 1.6 µm, y0 = 0.
Panels: (a) before recording; (b) after recording.
Horizontal axis: y [nm].

Figure 11 shows the read-out signal I_s versus the upper
width w_U/w_0 of the guide-groove for the TE-wave before
recording, where the lower width is w_L = 0.2w_0 + w_U. As
the upper width w_U/w_0 approaches 0.5, I_s decreases. When
h is lower than λ0/8n1, I_s changes little with w_U/w_0.

Figure 8: Distribution of magnetic fields near a center
guide-groove (TM-wave), for h = λ0/8n1, y0 = 0. Panel shown:
(b) after recording. Horizontal axis: y [nm].

Figure 12 shows the read-out signal I_s and the amplitude
of the read-out signal ΔI_s versus the film thickness of
each layer for the TE-wave, where panels (a), (b) and (c)
show the dependence on the film thickness of the upper
protective layer, the recording layer and the lower
protective layer. For the upper protective layer, the film
thickness does not affect the read-out signal I_s. The skin
depth of the recording layer is 0.043λ0 before recording and
0.075λ0 after recording. Hence, when the film thickness d2
of the recording layer is thinner than the skin depth, the
read-out signal I_s changes considerably. Because of
resonance, I_s depends on the film thickness d3 of the lower
protective layer. When the amplitude of the read-out signal
ΔI_s is maximum, the film thicknesses of the layers are
d1 = 180 nm, d2 = 40 nm and d3 = 200 nm. The maximum
amplitude of the read-out signal then changes by about 21 %.


Figure 9: Magnitude of equivalent induced currents on the
boundaries Γ1 and Γ2 (TE-wave), for h = λ0/8n1, b = 1.6 µm,
y0 = 0. Horizontal axis: y.

Figure 10: Dependence of read-out signal I_s and tracking
error signal I_d on the groove height h, for b = 1.6 µm,
y0 = 0, p = 1000λ0/n1. Horizontal axis: groove height h [nm].

4  Conclusion 

The scattering characteristics of laser-beam waves on a
phase-change optical disk memory were numerically analyzed
by using the finite element method combined with the
boundary element method, which can be applied to various
conditions. We obtained the following results:

(1) The main lobes of the scattering patterns fall to a
low value after recording for both polarizations. The
scattered far field at θ = 90° after recording is
lower by about 6 dB than that before recording. Side
lobes at about 70° appear after recording.

(2) The scattered far fields largely depend on resonance
in the lower protective layer.

(3) For the TE-wave, I_dmax is maximum at a height of
h = 70 nm. For the TM-wave, I_dmax is maximum at a
height of h = 60 nm. These values are about λ0/8n1.
When h is lower than λ0/8n1, the read-out signal I_s
retains half the intensity of the read-out signal at
a height of h = 0 nm.

(4) As the upper width w_U/w_0 approaches 0.5, I_s
decreases. When h is lower than λ0/8n1, I_s changes
little with w_U/w_0.

(5) When the film thicknesses of the layers are
d1 = 180 nm, d2 = 40 nm and d3 = 200 nm, the maximum
amplitude of the read-out signal ΔI_s changes by about
21 %.

These results give important field characteristics for
obtaining an optimum design of the phase-change optical disk.

Figure 11: Dependence of read-out signal I_s on the upper
width of guide-groove w_U/w_0, for w_L = 0.2w_0 + w_U,
b = 1.6 µm, y0 = 0, p = 1000λ0/n1.

Figure 12: Dependence of read-out signal I_s on each
film thickness, for h = λ0/8n1, b = 1.6 µm, y0 = 0.
Panels: (a) upper protective layer; (b) recording layer;
(c) lower protective layer.


ABSTRACT

We present 3D finite element formulations for the modeling of unbounded microwave problems.
The formulations are written directly in terms of vector fields. The open boundary is modeled using
the Engquist-Majda absorbing boundary condition. Two types of finite elements are compared: nodal-based
and mixed-based.


INTRODUCTION 

The finite element (F.E.) method is an efficient way of solving open boundary frequency domain
microwave problems, such as scattering or antenna radiation. Coupling it with an absorbing
boundary condition (A.B.C.) makes it possible to truncate the F.E. domain to a finite size by absorbing the
outgoing wave. Although global A.B.C. are exact by nature, they cannot be used in a 3D context,
because they generate a full matrix on the boundary. On the other hand, local A.B.C. are based on
an approximation, but they preserve the sparsity of the F.E. matrix. That is the reason why we have
chosen a local A.B.C. for our formulations. Furthermore, we prefer to work with a rectangular
outer boundary, so we use the Engquist-Majda A.B.C.

This paper deals with the modeling of unbounded microwave problems (computation of near and
far field) and compares two types of finite elements used for the numerical discretization: nodal-based
elements and H(curl) mixed elements. We first give the Galerkin form of the vector wave
equation. We then show how the Engquist-Majda A.B.C. is written, and how the numerical
discretization is performed using both types of finite elements. Finally, we compare both finite
elements on two examples: radiation from an infinitesimal dipole and scattering by a perfect
electric conducting cylinder.


FINITE  ELEMENT  FORMS  OF  THE  VECTOR  WAVE  EQUATION 

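By combining both curl Maxwell's equations written in the frequency domain, we get the vector wave
equation (for the magnetic field H for example):

$$ \nabla \times \left( \varepsilon_r^{-1} \nabla \times \mathbf{H} \right) - k_0^2 \mu_r \mathbf{H} = -j\omega\varepsilon_0 \mathbf{J} \tag{1} $$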

Galerkin  forms. 

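Applying the Galerkin method with a vector weighting function W, we obtain:

$$ \int_\Omega (\varepsilon_r^{-1} \nabla \times \mathbf{H}) \cdot (\nabla \times \mathbf{W}) \, d\Omega - k_0^2 \int_\Omega \mu_r \mathbf{H} \cdot \mathbf{W} \, d\Omega + \int_\Gamma \varepsilon_r^{-1} (\mathbf{n} \times \nabla \times \mathbf{H}) \cdot \mathbf{W} \, d\Gamma = -j\omega\varepsilon_0 \int_\Omega \mathbf{J} \cdot \mathbf{W} \, d\Omega \tag{2} $$

In the previous expression, the surface integral is used for the implementation of the local
absorbing boundary condition, through the approximate tangential operator T:

$$ \mathbf{n} \times \nabla \times \mathbf{H} = T(\mathbf{H}) \quad \text{on the external boundary} \tag{3} $$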

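The current density J is the source used for the computation of the radiation of an antenna, such as
an open ended waveguide [1]. For scattering problems, there is no localized source J, and the
vector wave equation becomes homogeneous (the right hand side in (1) is zero). The total field H is
separated into the incident field H^i and the scattered field H^s, and the A.B.C. is written only for the
scattered field. This then leads to the following formulation (called the total field formulation):

$$ \int_\Omega (\varepsilon_r^{-1} \nabla \times \mathbf{H}) \cdot (\nabla \times \mathbf{W}) \, d\Omega - k_0^2 \int_\Omega \mu_r \mathbf{H} \cdot \mathbf{W} \, d\Omega + \int_\Gamma \varepsilon_r^{-1} T(\mathbf{H}) \cdot \mathbf{W} \, d\Gamma = \int_\Gamma \varepsilon_r^{-1} \left\{ T(\mathbf{H}^i) - \mathbf{n} \times \nabla \times \mathbf{H}^i \right\} \cdot \mathbf{W} \, d\Gamma \tag{4} $$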


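Both expressions (2) and (4) are suited to a numerical implementation with mixed elements. If a scalar
weighting function N is now used to get the Galerkin form (case of nodal-based elements), these
expressions become:

- for an antenna problem:

$$ \int_\Omega (\nabla N \times \varepsilon_r^{-1} \nabla \times \mathbf{H}) \, d\Omega + k_0^2 \int_\Omega \mu_r N \mathbf{H} \, d\Omega - \int_\Gamma \varepsilon_r^{-1} N (\mathbf{n} \times \nabla \times \mathbf{H}) \, d\Gamma = -j\omega\varepsilon_0 \int_\Omega N \mathbf{J} \, d\Omega \tag{5} $$

- for a scattering problem:

$$ \int_\Omega (\nabla N \times \varepsilon_r^{-1} \nabla \times \mathbf{H}) \, d\Omega + k_0^2 \int_\Omega \mu_r N \mathbf{H} \, d\Omega - \int_\Gamma \varepsilon_r^{-1} N \, T(\mathbf{H}) \, d\Gamma = -\int_\Gamma \varepsilon_r^{-1} N \left\{ T(\mathbf{H}^i) - \mathbf{n} \times \nabla \times \mathbf{H}^i \right\} d\Gamma \tag{6} $$

Similar expressions to (2), (4), (5) and (6) may be obtained for the electric field E, by just replacing
H by E and interchanging ε_r and μ_r. For scattering problems, (4) and (6), the incident field has to be evaluated
only on the external boundary Γ.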


Boundary  conditions. 

In microwave engineering, good conductors are treated as lossless because of the high
frequency (at 1 GHz, the skin depth inside aluminium is about 2.6 µm). There is no field inside them
and they are assumed to be perfect: they are not meshed, and the boundary Γ in (2), (4), (5) and (6)
is made of the external boundary and of the boundaries of the conductors. On the conductors, we
have the boundary condition for the electric field:

$$ \mathbf{n} \times \mathbf{E} = \mathbf{n} \times (j\omega\varepsilon)^{-1} \nabla \times \mathbf{H} = 0 \tag{7} $$
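The skin depth figure quoted above can be checked with the usual good-conductor formula δ = 1/sqrt(π f μ σ). The sketch below is only a sanity check: the conductivity of aluminium (3.77e7 S/m) is an assumed handbook value that does not appear in the paper, and `skin_depth` is an illustrative helper name.

```python
import math

MU_0 = 4.0e-7 * math.pi          # vacuum permeability in H/m
SPEED_OF_LIGHT = 299_792_458.0   # m/s
SIGMA_ALUMINIUM = 3.77e7         # S/m, assumed handbook value (not from the paper)


def skin_depth(frequency_hz, conductivity, mu_r=1.0):
    """Skin depth of a good conductor: delta = 1 / sqrt(pi * f * mu * sigma)."""
    return 1.0 / math.sqrt(math.pi * frequency_hz * MU_0 * mu_r * conductivity)


frequency = 1.0e9
delta = skin_depth(frequency, SIGMA_ALUMINIUM)
wavelength = SPEED_OF_LIGHT / frequency

print(f"skin depth at 1 GHz : {delta * 1e6:.2f} micrometres")
print(f"free-space wavelength: {wavelength * 1e3:.0f} mm")
print(f"ratio delta / lambda : {delta / wavelength:.1e}")
```

With the assumed conductivity the skin depth comes out close to 2.6 µm, several orders of magnitude below the wavelength, which is why meshing the interior of the conductor would bring nothing.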

In the H-formulations, the surface term in (2), (4), (5) and (6) then cancels out on the boundaries of
the conductors, and the boundary condition is implicit. When these expressions are written for the
electric field E, the boundary condition can no longer be implicit and has to be explicitly enforced
in the global system matrix.


NUMERICAL IMPLEMENTATION


Nodal-based  finite  elements. 

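Although they seem more suitable for potential problems, nodal elements may be used for the
discretization of vector field problems: they are easy to implement, they conform exactly to curved
surfaces and they minimize the computation time. Furthermore, spurious modes may be
eliminated using the penalty function $-\nabla(\nabla \cdot \mu_r \mathbf{H})$: this makes the formulation equivalent to a
Laplacian one, and decouples the coordinates of the nodes inside the media. Including this penalty
function, (5) becomes [1]:

$$ \int_\Omega (\nabla N \times \varepsilon_r^{-1} \nabla \times \mathbf{H}) \, d\Omega + k_0^2 \int_\Omega \mu_r N \mathbf{H} \, d\Omega - \int_\Omega (\nabla \cdot \mu_r \mathbf{H})(\nabla N) \, d\Omega - \int_\Gamma \varepsilon_r^{-1} N (\mathbf{n} \times \nabla \times \mathbf{H}) \, d\Gamma + \int_\Gamma N (\nabla \cdot \mu_r \mathbf{H}) \, \mathbf{n} \, d\Gamma = -j\omega\varepsilon_0 \int_\Omega N \mathbf{J} \, d\Omega \tag{8} $$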

There are 3 complex unknowns per node, corresponding to the three components of the vector
field. Because of the penalty surface integral, the resulting matrix is not symmetric.

According to classical finite element analysis, we use first order hexahedra and each
component of the field H is interpolated by:

$$ H = \sum_{i=1}^{\text{nodes}} N_i H_i \quad \text{with} \quad N_i = \frac{1}{8} \prod_{k=1}^{3} (1 \pm u_k) \tag{9} $$
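To make the interpolation concrete, here is a minimal sketch of first order shape functions on a reference hexahedron, assuming the standard trilinear form on the cube [-1, 1]^3 with one function per corner. The local coordinates (u, v, w), the corner ordering and the helper names are assumptions of this sketch rather than the authors' code.

```python
from itertools import product

# Corner signs of the reference cube [-1, 1]^3, one triple per node.
CORNERS = list(product((-1.0, 1.0), repeat=3))


def shape_functions(u, v, w):
    """Eight trilinear shape functions N_i = (1/8)(1 +/- u)(1 +/- v)(1 +/- w)."""
    return [
        0.125 * (1.0 + su * u) * (1.0 + sv * v) * (1.0 + sw * w)
        for su, sv, sw in CORNERS
    ]


def interpolate(nodal_values, u, v, w):
    """Interpolate one field component from its eight nodal values."""
    return sum(n * h for n, h in zip(shape_functions(u, v, w), nodal_values))


# A field component that is linear in the local coordinates is reproduced exactly.
linear_field = lambda u, v, w: 1.0 + 2.0 * u - v + 0.5 * w
nodal_values = [linear_field(*corner) for corner in CORNERS]

point = (0.3, -0.2, 0.7)
print("sum of N_i      :", sum(shape_functions(*point)))
print("interpolated    :", interpolate(nodal_values, *point))
print("exact value     :", linear_field(*point))
```

Each function equals one at its own node and zero at the seven others, so the three field components at a node are directly the unknowns of the system.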

Note that, with nodal elements, taking into account the normal discontinuity of the fields at
material interfaces is not easy. It is nevertheless possible: the nodes at interfaces first have to be
decoupled, and the boundary condition then has to be explicitly enforced inside the system matrix.


Mixed-based  finite  elements. 

Mixed elements seem more physical and more suitable for the modeling of fields: they only enforce
the tangential continuity of the fields and allow the normal discontinuity at material interfaces.
Furthermore, this relaxation of normal continuity allows these elements to handle objects with sharp
edges.

Numerical discretization is performed using mixed elements conforming in the space H(curl) [2]: in
particular, first order R1 elements on hexahedra have been implemented. There is only one
unknown per edge and the unknown vector field H is expanded as:

$$ \mathbf{H} = \sum_{i=1}^{N} H_i \mathbf{W}_i \quad \text{with} \quad H_i = \sigma_i(\mathbf{H}) \tag{10} $$

where N is the total number of degrees of freedom associated with the mesh. The vector shape
function W_i associated with the edge C = {a_m, a_n} is given by [3]:

$$ \mathbf{W}_i = N_m \nabla L_n - N_n \nabla L_m \tag{11} $$

and has the degree of freedom σ_i:

$$ \sigma_i(\mathbf{p}) = \int_C \mathbf{p} \cdot \boldsymbol{\tau}_{mn} \, ds \tag{12} $$

where: - L_m and L_n are first order nodal-based functions associated with the edge C = {a_m, a_n},

- N_i are Lagrange shape functions of first order within the hexahedron,

- τ_mn is a unit tangent vector to the edge C, which orients it.

In this form, these elements clearly appear to be an extension of Whitney edge elements to
hexahedra.
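The property that makes edge elements enforce only tangential continuity can be seen on the simplest member of the family. The sketch below does not use the hexahedral R1 element of the paper: as a simpler stand-in it evaluates the lowest-order Whitney function W_mn = L_m ∇L_n - L_n ∇L_m on a triangle with barycentric coordinates L_i, and checks its tangential component along each edge. The triangle and all helper names are illustrative.

```python
import math


def barycentric_gradients(tri):
    """Constant gradients of the three barycentric coordinates of a triangle."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    twice_area = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    return [
        ((y1 - y2) / twice_area, (x2 - x1) / twice_area),
        ((y2 - y0) / twice_area, (x0 - x2) / twice_area),
        ((y0 - y1) / twice_area, (x1 - x0) / twice_area),
    ]


def whitney_edge(tri, m, n, point):
    """Whitney edge function W_mn = L_m grad(L_n) - L_n grad(L_m) at a point."""
    grads = barycentric_gradients(tri)
    # L_i is linear with L_i(vertex i) = 1, so L_i(p) = 1 + grad(L_i) . (p - vertex i).
    lam = [
        1.0 + g[0] * (point[0] - p[0]) + g[1] * (point[1] - p[1])
        for p, g in zip(tri, grads)
    ]
    return (
        lam[m] * grads[n][0] - lam[n] * grads[m][0],
        lam[m] * grads[n][1] - lam[n] * grads[m][1],
    )


def tangential_component(tri, m, n, a, b, s):
    """Tangential component of W_mn at fraction s along the edge from vertex a to b."""
    (xa, ya), (xb, yb) = tri[a], tri[b]
    length = math.hypot(xb - xa, yb - ya)
    point = (xa + s * (xb - xa), ya + s * (yb - ya))
    wx, wy = whitney_edge(tri, m, n, point)
    return (wx * (xb - xa) + wy * (yb - ya)) / length


TRIANGLE = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
for a, b in [(0, 1), (1, 2), (2, 0)]:
    samples = [tangential_component(TRIANGLE, 0, 1, a, b, s) for s in (0.1, 0.5, 0.9)]
    print(f"edge {a}-{b}: tangential component of W_01 =", samples)
```

The function attached to edge 0-1 has a constant tangential component (one over the edge length) along its own edge and none along the two others, so a single unknown per edge fixes the tangential field while the normal component remains free to jump.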


3D  ENGQUIST-MAJDA  A.B.C. 

Because each component of H is a solution of the scalar Helmholtz equation and satisfies the
Sommerfeld condition, it follows that each component of H may be approximated by the 2D
Engquist-Majda A.B.C. on the rectangular outer boundary. Hence we can derive the following
3D vector A.B.C.:

$$ \mathbf{n} \times \nabla \times \mathbf{H} = T(\mathbf{H}) = jk_0 \mathbf{H}_t + \frac{j}{2k_0} \nabla_t^2 \mathbf{H}_t + \nabla_t (\mathbf{n} \cdot \mathbf{H}) \tag{13} $$
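The step from the scalar condition to the vector one can be written out as follows. This is a sketch under stated assumptions, not the authors' derivation: an $e^{j\omega t}$ time dependence, a planar boundary face with constant outward normal $\mathbf{n}$, and the decomposition $\mathbf{H} = \mathbf{H}_t + \mathbf{n}(\mathbf{n} \cdot \mathbf{H})$.

```latex
% Assumptions (not stated in the paper): e^{j\omega t} convention, planar face,
% constant outward normal n, H = H_t + n (n . H).
\begin{align*}
\mathbf{n} \times (\nabla \times \mathbf{H})
  &= \nabla(\mathbf{n} \cdot \mathbf{H}) - \frac{\partial \mathbf{H}}{\partial n}
   = \nabla_t(\mathbf{n} \cdot \mathbf{H}) - \frac{\partial \mathbf{H}_t}{\partial n}
   && \text{(the normal parts cancel)} \\
\frac{\partial \mathbf{H}_t}{\partial n}
  &\approx -jk_0 \mathbf{H}_t - \frac{j}{2k_0} \nabla_t^2 \mathbf{H}_t
   && \text{(second-order scalar condition, per component)} \\
\mathbf{n} \times (\nabla \times \mathbf{H})
  &\approx jk_0 \mathbf{H}_t + \frac{j}{2k_0} \nabla_t^2 \mathbf{H}_t
     + \nabla_t(\mathbf{n} \cdot \mathbf{H})
\end{align*}
```

The first line uses the identity $\mathbf{n} \times (\nabla \times \mathbf{H}) = \nabla(\mathbf{n} \cdot \mathbf{H}) - (\mathbf{n} \cdot \nabla)\mathbf{H}$, valid for a constant vector $\mathbf{n}$; the last line has the same three terms as the vector condition stated above.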

This A.B.C. leads to a non-symmetric linear system matrix, due to the last term in (13): this is not
important when using nodal-based elements, because the matrix is already non-symmetric.
However, when using mixed-based elements, an alternative symmetric version may be obtained
using some simple approximations [3]: we just state that, because (n·H) is a scalar radiation field, it
can be approximated by a first order Engquist-Majda A.B.C., which means (because the divergence
of H is zero):

$$ \nabla \cdot \mathbf{H}_t = jk_0 (\mathbf{n} \cdot \mathbf{H}), \quad \text{implying that} \quad \nabla_t (\nabla \cdot \mathbf{H}_t) = jk_0 \nabla_t (\mathbf{n} \cdot \mathbf{H}) \tag{14} $$

The "symmetric" 3D vector Engquist-Majda condition is then:

$$ \mathbf{n} \times \nabla \times \mathbf{H} = T(\mathbf{H}) = jk_0 \mathbf{H}_t + \frac{j}{2k_0} \nabla_t^2 \mathbf{H}_t - \frac{j}{k_0} \nabla_t (\nabla \cdot \mathbf{H}_t) \tag{15} $$

Rigorously, one should evaluate the line integrals coming from the implementation of the A.B.C. in
the Galerkin form: they do not vanish on the edges of the external rectangular boundary, and
adequate edge and corner conditions should be prescribed. We however make the approximation
that the parasitic waves generated by these geometrical singularities are essentially local (this is
justified), and that only a small amount of them propagates toward the interior of the domain.
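Since a local A.B.C. is only approximate, it can help to see how much a plane wave is reflected as a function of its angle of incidence. The sketch below uses the textbook plane-wave analysis of the scalar first- and second-order conditions on an infinite planar boundary, which gives reflection magnitudes (1 - cos θ)/(1 + cos θ) and its square. These closed forms are not taken from the paper, and they ignore the edges and corners discussed above.

```python
import math


def reflection_first_order(theta_deg):
    """|R| of the first-order scalar condition for incidence angle theta from the normal."""
    c = math.cos(math.radians(theta_deg))
    return (1.0 - c) / (1.0 + c)


def reflection_second_order(theta_deg):
    """|R| of the second-order scalar condition (square of the first-order value)."""
    return reflection_first_order(theta_deg) ** 2


print("theta (deg)   first order   second order")
for theta in (0, 15, 30, 45, 60, 75):
    r1 = reflection_first_order(theta)
    r2 = reflection_second_order(theta)
    print(f"{theta:11d}   {r1:11.4f}   {r2:12.4f}")
```

Both conditions are exact at normal incidence and degrade towards grazing incidence, the second-order one much more slowly, which is one reason to keep the truncation boundary at some distance from the sources.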


VALIDATION 

Radiation  by  an  infinitesimal  dipole. 

This example has already been presented in [1] and [3]: it validates the F.E. forms coupled with the
symmetric and non-symmetric Engquist-Majda A.B.C. A very thin current element of short length
and with a constant current is positioned symmetrically at the origin and oriented along the z axis.
The problem has been modeled with two symmetries (yz and zx planes). The size of the
computational domain is 1λ x 1λ x 2.1λ. Each finite element is a 0.1λ x 0.1λ x 0.1λ brick. This
leads to a mesh made of 2662 nodes. The frequency is 3 GHz.

Fig. 1 shows that the computed solution departs from the analytical one close to the
edges and corners of the F.E. domain. In the interior of the domain, both solutions are close.
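The mesh size quoted for the dipole example can be cross-checked by counting the nodes and edges of a structured brick mesh. The sketch below assumes a regular grid of 10 x 10 x 21 bricks (a 1λ x 1λ x 2.1λ domain cut into 0.1λ cubes) and counts every node and edge, without removing any unknown through boundary conditions; the function name is illustrative.

```python
def structured_hex_counts(nx, ny, nz):
    """Count entities of a regular grid of nx * ny * nz hexahedra."""
    nodes = (nx + 1) * (ny + 1) * (nz + 1)
    # Edges parallel to x, y and z respectively.
    edges = (
        nx * (ny + 1) * (nz + 1)
        + (nx + 1) * ny * (nz + 1)
        + (nx + 1) * (ny + 1) * nz
    )
    return {
        "elements": nx * ny * nz,
        "nodes": nodes,
        "nodal_unknowns": 3 * nodes,  # three field components per node
        "edge_unknowns": edges,       # one unknown per edge
    }


counts = structured_hex_counts(10, 10, 21)
for name, value in counts.items():
    print(f"{name:15s} {value}")
```

Under these assumptions the grid has 2662 nodes, hence 7986 nodal unknowns, and 7381 edges, the same figures as the unknown counts reported for the nodal and mixed formulations in Tab. 1.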


Fig. 1: instantaneous magnetic field in the yz plane at x = 0.1λ
Left: analytical solution - right: mixed finite elements with symmetric A.B.C.

This  is  confirmed  by  Fig. 2,  which  shows  also  that  the  accuracy  of  the  symmetric  Engquist-Majda 
condition  coupled  to  mixed-based  F.E.  is  approximately  the  same  as  the  non-symmetric  one 
coupled  to  either  nodal-  or  mixed-based  F.E. 


Fig. 2: instantaneous magnetic field along a line perpendicular to the dipole,
going from the dipole to the external boundary. Horizontal axis: distance (m), from 0 to 0.1.

Comparison of nodal F.E. formulation with mixed F.E. formulation.

Tab. 1 compares the number of unknowns and the computation time for both formulations. Note that,
for the nodal-based F.E. formulation, the A.B.C. leads to a non-symmetric system matrix. This matrix
is however approximately symmetrized by adding the transposed matrix, and solved using a
symmetric solver: the results seem quite good.

From Tab. 1, it can be seen that the number of unknowns for first order nodal and R1 mixed
bricks is nearly the same. On the other hand, the number of non-zero terms is really different (nodal
and mixed symmetric have to be compared): mixed elements generate far fewer non-zero terms (by
a factor of about 1.6). However, the CPU time is higher with mixed elements, which means that our
solver is not well suited to this type of element.

formulation   A.B.C.          unknowns   non-zero terms   CPU time (in s)   solver
nodal         non-symmetric       7986           176919               498   conjugate gradient
mixed         symmetric           7381           111301              1028   conjugate gradient
mixed         non-symmetric       7381           215221             18779   gmres

Tab. 1: comparison of nodal- and mixed-based F.E. formulations
for the modeling of the infinitesimal dipole (2662 nodes).

NB: the system matrix resulting from the nodal formulation is approximately
symmetrized and solved with a symmetric solver (conjugate gradient)


Scattering  by  a  perfect  electric  conducting  cylinder. 

Both formulations (nodal-based and mixed-based with symmetric A.B.C.) are now compared for
the case of the scattering by a perfect electric conducting cylinder of various lengths (Fig. 3): the
radius of the cylinder is 0.6λ and its length (along the z axis) goes from 0.6λ to 5λ. The incident
magnetic field has only one component along the z axis, and propagates in the direction of
positive y. The frequency is 3 GHz.


Fig. 3: scattering by a p.e.c. cylinder - 3 GHz - incident field H^i along z and wave vector k^i along y
Instantaneous magnetic field on the cylinder and on the external box.


Tab. 2 summarizes the results in terms of unknowns, number of non-zero terms and CPU time.
Again, for a given problem, the number of unknowns generated is nearly the same for both types of F.E.,
while the number of non-zero terms is about 1.6 times smaller with R1 mixed elements. We also compare
in Fig. 4 the CPU times for the assembling of the F.E. system matrix (because the solver is not
adapted to mixed elements, we do not take into account the CPU time for the solving of the system
matrix).

                    first order nodal                       R1 mixed
length   nodes   unknowns   non-zero   assembling   unknowns   non-zero   assembling
                                           (in s)                            (in s)
0.6       5109      15327     369354          477      14311     218785          472
1.2       6821      20463     485010          685      19167     294345          766
1.8       8319      24957     586209          886      23416     360460         1078
2.3       9817      29451     687408         1106      27665     426575         1430
2.8      11101      33303     774150         1308      31307     483245         1778
3.3      12599      37797     875349         1572      35556     549360         2291
4.3      15167      45501    1048833         2035      42840     662700         3129
5.0      17093      51279    1178946         2427      48303     747705         3890

Tab. 2: comparison of nodal- and mixed-based F.E. formulations for the modeling of p.e.c. cylinders,
length is given in wavelengths - assembling is the CPU time to build the F.E. system matrix
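The trends read off Tab. 2 can be reproduced with a few lines of arithmetic. The sketch below copies the table values and computes, for each cylinder length, the ratio of non-zero terms (nodal over mixed) and the ratio of assembling times (mixed over nodal); the variable and function names are illustrative.

```python
# Columns: length (wavelengths), nodes,
#          nodal unknowns, nodal non-zeros, nodal assembling time (s),
#          mixed unknowns, mixed non-zeros, mixed assembling time (s).
TABLE_2 = [
    (0.6, 5109, 15327, 369354, 477, 14311, 218785, 472),
    (1.2, 6821, 20463, 485010, 685, 19167, 294345, 766),
    (1.8, 8319, 24957, 586209, 886, 23416, 360460, 1078),
    (2.3, 9817, 29451, 687408, 1106, 27665, 426575, 1430),
    (2.8, 11101, 33303, 774150, 1308, 31307, 483245, 1778),
    (3.3, 12599, 37797, 875349, 1572, 35556, 549360, 2291),
    (4.3, 15167, 45501, 1048833, 2035, 42840, 662700, 3129),
    (5.0, 17093, 51279, 1178946, 2427, 48303, 747705, 3890),
]


def nonzero_ratios(rows):
    """Non-zero terms of the nodal matrix divided by those of the mixed matrix."""
    return [row[3] / row[6] for row in rows]


def assembly_ratios(rows):
    """Assembling time with mixed elements divided by that with nodal elements."""
    return [row[7] / row[4] for row in rows]


print("length   unknowns nodal/mixed   non-zeros nodal/mixed   assembling mixed/nodal")
for row, nz, asm in zip(TABLE_2, nonzero_ratios(TABLE_2), assembly_ratios(TABLE_2)):
    print(f"{row[0]:6.1f}   {row[2] / row[5]:20.3f}   {nz:21.3f}   {asm:22.3f}")
```

The non-zero ratio stays between about 1.58 and 1.69 over the whole range, while the assembling time of the mixed elements grows from parity at the smallest mesh to roughly 1.6 times the nodal value at the largest one.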


Fig. 4: comparison of CPU times for the assembling of the F.E. system matrix,
depending on the number of nodes of the mesh.
Horizontal axis: number of nodes. Curves: first order nodal, R1 mixed.


CONCLUSION 

We have compared in this paper two types of finite elements used for the modeling of open
boundary microwave problems. Although the accuracy of both types of elements is comparable, R1 mixed
finite elements seem more suitable for field problems, because the discontinuities of the fields are
easy to handle. The next step will be to work on the solver itself, in order to get acceptable CPU times.


REFERENCES 

[1] L. Nicolas, K.A. Connor, S.J. Salon, B.G. Ruth, L.F. Libelo, "Three dimensional finite element
analysis of high power microwave devices", IEEE Trans. on Mag., vol. 29, pp. 1642-1645, Mar. 1993.

[2] J.C. Nedelec, "A new family of mixed elements in R^3", Numer. Math. 50, pp. 57-81, 1986.

[3] J.L. Yao Bi, L. Nicolas, A. Nicolas, "H(curl) elements on hexahedral and vector A.B.C.'s for
unbounded microwave problems", accepted for IEEE Trans. on Mag., May 1995.

[4] L. Nicolas, "An integral-type approach for the computation of the far field radiated by
microwave devices", IEEE Trans. on Mag., vol. 30, pp. 3124-3127, Sept. 1994.




A  rationale  for  the  use  of  mixed-order  basis  functions  within  finite 
element  solutions  of  the  vector  Helmholtz  equation 


Andrew  F.  Peterson 

School  of  Electrical  and  Computer  Engineering 
Georgia  Institute  of  Technology 
Atlanta,  GA  30332-0250 

Donald R.  Wilton 

Department  of  Electrical  Engineering 
University  of  Houston 
Houston,  TX  77204-4793 


Abstract:  The  curl-curl  form  of  the  vector  Helmholtz  equation  can  be  used  to  describe  the 
behavior  of  three-dimensional  time-harmonic  electromagnetic  fields.  Finite  element  discretizations 
of  this  equation  encounter  difficulties  related  to  the  presence  of  numerous  eigenfunctions  belonging 
to  the  nullspace  of  the  operator,  in  addition  to  the  desired  eigensolutions.  Spurious  numerical 
solutions  arising  in  waveguide  and  cavity  formulations  appear  to  be  caused  by  the  inability  of  the 
basis  functions  to  properly  model  the  nullspace  eigenfunctions,  and  can  be  alleviated  by  using  a 
complete  polynomial  basis  that  only  imposes  tangential  continuity  from  cell  to  cell.  However,  such 
an  expansion  tends  to  capture  many  eigenfunctions  from  the  nullspace,  and  produces  a  relatively 
large  number  of  zero  eigenvalues.  A  reduction  in  the  number  of  zero  eigenvalues  can  be  obtained 
through  the  use  of  special  mixed-order  functions,  such  as  the  “edge  elements”  proposed  by 
Nedelec  in  1980.  Although  the  lowest-order  edge  elements  are  in  widespread  use  in  the 
electromagnetics  community,  their  extension  to  higher  polynomial  orders  has  been  inhibited  by  an 
incomplete  understanding  of  their  properties.  In  this  paper,  we  demonstrate  that  polynomial- 
complete  basis  functions  can  be  separated  into  two  subsets,  one  of  which  has  zero  curl  and  can 
only  represent  eigenfunctions  in  the  nullspace  of  the  operator.  Members  of  this  set  can  be 
discarded  to  improve  the  efficiency  of  solution,  leaving  mixed-order  basis  functions.  Specific 
examples  of  linear,  quadratic,  and  cubic  order  basis  functions  will  be  presented.  Numerical  results 
will  be  used  for  illustration. 


1.  Introduction 

The  curl-curl  form  of  the  vector  Helmholtz  equation  can  be  used  to  describe  the  behavior  of 
time-harmonic  three-dimensional  electromagnetic  fields.  Finite  element  discretizations  of  this 
equation  encounter  difficulties  related  to  the  presence  of  numerous  eigenfunctions  belonging  to  the 
nullspace of the operator, in addition to the desired eigensolutions [1]. For example, waveguide
and cavity formulations based on Lagrangian interpolation polynomials produce spurious nonzero
eigenvalues, believed to be grossly inaccurate approximations of the zero eigenvalues associated
with the nullspace of the operator. In recent years, certain types of mixed-order vector basis
functions (“edge elements”) have been shown to eliminate the spurious eigenvalues, or at least to
approximate those eigenfunctions accurately enough to produce the proper eigenvalues of zero.
However, the process by which they accomplish this is not widely understood. In addition,
several different families of vector basis functions have been proposed [1-6]. Only a few of these
higher-order basis functions have been subjected to systematic numerical tests to evaluate their
performance. As an additional point of confusion, the functions proposed for triangular and
tetrahedral cells differ from those used with quadrilateral and hexahedral cells, both in the available
number of degrees of freedom and in the mathematical form of the representation. Finally, the
merits of mixed-order edge elements versus polynomial-complete edge elements remain in
question [7].

The  purpose  of  this  paper  is  to  attempt  to  provide  a  rationale  for  the  use  of  mixed-order 
vector  basis  functions  with  the  curl-curl  equation.  We  first  consider  the  nature  of  the  eigensolution 
families,  then  demonstrate  that  polynomial-complete  functions  that  do  not  impose  normal 
continuity  from  cell  to  cell  properly  model  the  nullspace  of  the  operator.  Complete  polynomial 
expansions  can  be  separated  into  two  subsets,  one  of  which  has  zero  curl  and  will  only  contribute 
to  additional  eigenfunctions  in  the  nullspace.  Members  of  this  set  can  be  discarded  to  improve  the 
efficiency of solution, leaving a mixed-order representation of the Nedelec type [2]. Specific
examples  of  linear,  quadratic,  and  cubic  order  basis  functions  for  triangular  cells  will  be  presented. 
Numerical  results  will  be  used  for  illustration. 


2.  Properties  of  the  vector  Helmholtz  eigensolutions 

Consider the vector Helmholtz equation

$$ \nabla \times \nabla \times \mathbf{E} = k^2 \mathbf{E} \tag{1} $$

for the electric field in a homogeneous source-free region. Eigenfunctions of this equation can
generally be separated into two families, one of which is a valid electromagnetic field of the form
{E = ∇ × V}, and the other of which has the form {E = ∇Φ}. The gradient ∇Φ is a mathematical
solution to (1) but does not represent an electromagnetic field in a source-free region. Since
∇ × ∇Φ = 0, these solutions only satisfy (1) for k = 0. Such eigenfunctions are said to form the
“nullspace” of the curl-curl operator. Both eigenfamilies satisfy the boundary conditions as well as
the Helmholtz equation. An example of one family of continuous eigenfunctions from the
nullspace for a 2D cavity is provided in an earlier article [1].

The solution family $\{\nabla\Phi\}$ is of interest, even when k is not zero, because a general
discretization of the Helmholtz operator will capture eigenfunctions from both families. In other
words, unless the basis functions are orthogonal to all functions in the nullspace, a matrix
representing the curl-curl operator will have some eigenvectors that approximate those functions.
Both eigensolution families must maintain tangential continuity across any mathematical boundary,
and in the absence of medium discontinuities the electromagnetic fields $\{\mathbf{E} = \nabla \times \mathbf{V}\}$ must also
exhibit normal continuity. However, members of the family $\{\nabla\Phi\}$ may exhibit jump
discontinuities in their normal component while still maintaining the property that $\nabla \times \nabla\Phi = 0$. In
fact, experience suggests that numerical solutions of the form $\{\nabla\Phi\}$ tend to be highly
discontinuous functions. It appears that the spurious nonzero eigenvalues obtained when
discretizing (1) with traditional Lagrangian functions are a consequence of the use of continuous
basis functions to approximate the highly discontinuous eigenfunctions. It is easy to show that the
projection of a discontinuous function with zero curl onto a continuous basis set results in a
function having nonzero curl, and thus a “spurious” nonzero eigenvalue.

Actual  solutions  of  the  more  general  vector  Helmholtz  equation 

$\nabla \times \left( \frac{1}{\mu_r} \nabla \times \mathbf{E} \right) = k^2 \epsilon_r \mathbf{E}$  (2)

exhibit  jump  discontinuities  in  the  field  components  normal  to  material  interfaces  (which  typically 
coincide  with  cell  boundaries  in  a  finite  element  solution).  Therefore,  it  appears  that  discretizations 
of  (1)  and  (2)  should  employ  a  basis  set  that  imposes  tangential  continuity  but  not  normal 
continuity. Such basis functions are known as “curl conforming” [2].


3.  Polynomial-complete  expansions  for  triangular  cells 

Polynomial-complete  curl-conforming  basis  functions  have  been  used  by  Nedelec  [3],  Mur 
[7],  and  others  for  triangular  and  tetrahedral  cells.  Six  linear-order  basis  functions  overlap  a 
triangular  cell.  Within  a  cell,  they  have  the  obvious  Cartesian  representation 

$\mathbf{B}(x,y) = \hat{x}\,(A + Bx + Cy) + \hat{y}\,(D + Ex + Fy)$  (3)

containing  six  degrees  of  freedom.  From  this  representation,  specific  basis  functions  can  be 
obtained  that  interpolate  to  the  tangential  component  at  the  ends  of  each  edge  of  the  cell.  These 
functions  maintain  tangential  continuity  by  sharing  that  coefficient  with  the  analogous  function 
defined  in  the  adjacent  cell  but  do  not  impose  normal  continuity  from  cell  to  cell.  There  are  a  total 
of  two  basis  functions  per  edge  throughout  the  model.  Table  1  shows  the  simplex-coordinate 
representation  of  these  basis  functions.  Table  2  shows  numerical  eigenvalues  produced  when 
these  basis  functions  are  used  to  discretize  (1)  for  a  circular,  homogeneous  cavity.  The  results 
contain a large number of zero eigenvalues that presumably represent the nullspace, and nonzero
eigenvalues  that  appear  to  have  a  one-to-one  correlation  with  analytical  results  for  the 
electromagnetic  cavity  modes. 

Analogous  basis  functions  can  be  created  that  provide  a  complete  quadratic  representation. 
An  expansion  of  the  form 

$\mathbf{B}(x,y) = \hat{x}\,(A + Bx + Cy + Dx^2 + Exy + Fy^2)$

$\qquad + \hat{y}\,(G + Hx + Iy + Jx^2 + Kxy + Ly^2)$  (4)

contains  twelve  degrees  of  freedom,  and  basis  functions  can  be  defined  (for  instance)  that 
interpolate  to  three  tangential  components  along  each  edge  of  a  triangular  cell  and  one  normal 
component at the middle of each edge. Their expression in simplex coordinates appears in Table 1.
The  nine  quadratic  basis  functions  that  interpolate  to  tangential  components  share  a  coefficient  with 
the  analogous  function  in  the  neighboring  cell  in  order  to  maintain  tangential  continuity.  The  three 
basis functions per cell that interpolate to the normal component are entirely local and thus the
normal  component  is  not  constrained  to  be  continuous  from  cell  to  cell.  The  resulting 
representation  requires  three  unknowns  per  edge  and  three  unknowns  per  cell.  Table  2  shows 
numerical  eigenvalues  produced  using  these  basis  functions  to  model  fields  within  a  circular  cavity. 
The  results  contain  a  large  number  of  zero  eigenvalues,  and  nonzero  eigenvalues  that  appear  to 
have  a  one-to-one  correspondence  with  analytical  results  for  the  cavity  modes. 


Table 1

Simplex-coordinate definition of polynomial-complete
basis functions throughout a triangular cell.

Linear (LT/LN)

Quadratic (QT/QN)

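$\xi_3(2\xi_3 - 1)\nabla\xi_1$

$\xi_3(2\xi_3 - 1)\nabla\xi_2$

$\xi_2\nabla\xi_3$

$\xi_1(2\xi_1 - 1)\nabla\xi_3$

$\xi_2(2\xi_2 - 1)\nabla\xi_3$
$\xi_2\xi_3(\nabla\xi_2 - \nabla\xi_3)$
$\xi_1\xi_3(\nabla\xi_3 - \nabla\xi_1)$
$\xi_1\xi_2(\nabla\xi_1 - \nabla\xi_2)$
$\xi_2\xi_3\nabla\xi_1$
$\xi_1\xi_3\nabla\xi_2$

$\xi_1\xi_2\nabla\xi_3$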

Table 2

Lowest eigenvalues produced by a discretization of a
circular cavity with $\epsilon_r = 1$, $\mu_r = 1$, and unit radius using
polynomial-complete linear (LT/LN) and quadratic
(QT/QN) basis functions, for the TE polarization. The
model consisted of 42 triangular cells and resulted in a
matrix of order 108 for the LT/LN functions and 288 for
the QT/QN functions.

LT/LN        QT/QN        exact

0.0  (67)    0.0  (163)
1.87 (2)     1.84 (2)     1.841 (2)
3.20 (2)     3.06 (2)     3.054 (2)
4.13 (1)     3.84 (1)     3.832 (1)
4.6  (2)     4.21 (2)     4.201 (2)
6.0  (2)     5.35 (2)     5.318 (2)
6.1  (2)     5.37 (2)     5.331 (2)
4. Mixed-order expansions for triangular cells

The  data  in  Table  2  suggest  that  polynomial-complete  functions  are  sufficient  for 
eliminating  spurious  nonzero  eigenvalues,  as  long  as  they  impose  only  tangential  continuity  from 
cell  to  cell.  However,  it  appears  that  a  large  fraction  of  the  available  degrees  of  freedom  in  the 
expansions  of  (3)  and  (4)  are  used  to  capture  eigensolutions  in  the  nullspace  of  the  curl-curl 
operator.  To  reduce  wasted  computational  effort,  some  of  the  degrees  of  freedom  associated  with 
the nullspace can be eliminated. Nedelec appears to be the first to (a) observe that the degrees of
freedom associated with the gradient of an order-(n+1) polynomial belong to the nullspace of an
order-n representation, and (b) develop basis functions with reduced degrees of freedom [2].

The  linear  vector  basis  function  in  (3)  can  be  projected  onto  two  subspaces  to  obtain 


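$\mathbf{B}(x,y) = \hat{x}\,\{A + \tfrac{1}{2}(C - E)\,y\} + \hat{y}\,\{D + \tfrac{1}{2}(E - C)\,x\}$  (5)

and a complementary representation

$\mathbf{B}_{grad}(x,y) = \hat{x}\,\{Bx + \tfrac{1}{2}(C + E)\,y\} + \hat{y}\,\{\tfrac{1}{2}(E + C)\,x + Fy\}$  (6)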

Equations  (5)  and  (6)  each  contain  three  degrees  of  freedom.  The  functions  in  (6)  have  identically 
zero  curl  within  the  cell.  Since  these  functions  are  constrained  to  have  tangential  continuity  from 
cell  to  cell,  their  curl  is  identically  zero  over  the  entire  problem  domain.  Under  these  conditions, 
(6)  is  the  general  form  of  the  gradient  of  a  quadratic  polynomial,  and  basis  functions  with  this  form 
can  only  represent  functions  in  the  nullspace  of  the  curl-curl  operator.  Thus,  it  should  be  possible 
to  restrict  the  basis  set  to  the  functions  in  (5).  The  reduced  expansion  defined  by  (5)  consists  of 
mixed-order polynomial functions that provide a constant tangential component and a linear normal
(CT/LN) component along any cut through the cell. Simplex coordinate descriptions of the CT/LN
functions  are  provided  in  Table  3.  Table  4  shows  numerical  eigenvalues  produced  by  the  CT/LN 
basis  set  when  used  to  discretize  (1)  for  a  circular,  homogeneous  cavity  of  unit  radius.  As 
compared  to  the  complete  linear  (LT/LN)  data  in  Table  2,  the  results  contain  the  same  number  of 
nonzero  eigenvalues  but  far  fewer  zero  eigenvalues  (the  number  of  zero  eigenvalues  is  reduced  by 
exactly  the  number  of  basis  functions  excluded,  which  is  half  the  original  matrix  order). 
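
The split of (3) into (5) and (6) can be checked with a few lines of exact arithmetic. The sketch below is only an illustration: the helper names (`split_linear`, `curl_z`, `add`) and the sample coefficient values are assumptions introduced here, not part of the original formulation. It stores each linear field as the constant, x and y coefficients of its two components and confirms that (6) is curl-free while (5) carries the whole curl of (3).

```python
from fractions import Fraction

def split_linear(A, B, C, D, E, F):
    """Split the complete linear field of (3) into the parts (5) and (6).

    Each part is returned as ((x0, xx, xy), (y0, yx, yy)): the constant,
    x and y coefficients of the x- and y-components.
    """
    half = Fraction(1, 2)
    mixed = ((A, 0, half * (C - E)), (D, half * (E - C), 0))   # eq. (5)
    grad = ((0, B, half * (C + E)), (0, half * (E + C), F))    # eq. (6)
    return mixed, grad

def curl_z(field):
    """z-directed curl d(By)/dx - d(Bx)/dy of a linear field (a constant)."""
    (_, _, bx_y), (_, by_x, _) = field
    return by_x - bx_y

def add(f, g):
    """Component-wise sum of two linear fields."""
    return tuple(tuple(a + b for a, b in zip(fc, gc)) for fc, gc in zip(f, g))

# Arbitrary sample values for A..F (hypothetical, chosen only for the check).
A, B, C, D, E, F = (Fraction(n) for n in (2, -3, 5, 7, 11, -13))
mixed, grad = split_linear(A, B, C, D, E, F)
complete = ((A, B, C), (D, E, F))

print("curl of (6):", curl_z(grad))
print("curl of (5):", curl_z(mixed), " curl of (3):", curl_z(complete))
print("(5) + (6) reproduces (3):", add(mixed, grad) == complete)
```

Because the curl of a linear field is the constant E - C, only the antisymmetric combination of C and E survives in (5); the symmetric combination, together with B and F, is what (6) removes.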

The  quadratic  representation  in  (4)  can  be  modified  in  an  analogous  manner.  The  complete 
expansion  can  be  projected  onto  two  subspaces,  to  produce  a  representation  of  the  form 

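$\mathbf{B}(x,y) = \hat{x}\,\{A + Bx + Cy + \tfrac{1}{3}(E - 2J)\,xy + \tfrac{1}{3}(2F - K)\,y^2\}$

$\qquad + \hat{y}\,\{G + Hx + Iy + \tfrac{1}{3}(2J - E)\,x^2 + \tfrac{1}{3}(K - 2F)\,xy\}$  (7)

containing eight degrees of freedom and a complementary representation

$\mathbf{B}_{grad}(x,y) = \hat{x}\,\{Dx^2 + \tfrac{2}{3}(E + J)\,xy + \tfrac{1}{3}(F + K)\,y^2\}$

$\qquad + \hat{y}\,\{\tfrac{1}{3}(E + J)\,x^2 + \tfrac{2}{3}(K + F)\,xy + Ly^2\}$  (8)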

containing four degrees of freedom. It can be seen that (8) has identically zero curl, and thus those
degrees of freedom only represent functions in the nullspace of the curl-curl operator. Therefore, it
Table 3


Simplex-coordinate definition of mixed-order basis functions within a triangular
cell. The last two LT/QN and last six QT/CuN functions are entirely local.


Table 4

Lowest eigenvalues produced by a discretization of a
circular cavity with $\epsilon_r = 1$, $\mu_r = 1$, and unit radius using
mixed-order basis functions, for the TE polarization. The
42-cell model produced a matrix with order 54 for the
CT/LN functions and 192 for the LT/QN functions.

CT/LN        LT/QN        exact

0.0  (13)    0.0  (67)
1.86 (2)     1.84 (2)     1.841 (2)
3.10 (2)     3.05 (2)     3.054 (2)
3.82 (1)     3.84 (1)     3.832 (1)
4.28 (2)     4.20 (2)     4.201 (2)
5.27 (2)     5.32 (2)     5.318 (2)
5.39 (2)     5.35 (2)     5.331 (2)
should be possible to restrict the expansion to the eight degrees of freedom in (7). These happen to
produce  a  linear-tangential,  quadratic-normal  (LT/QN)  representation,  and  basis  functions  can  be 
defined  that  interpolate  to  the  tangential  and  normal  components  in  several  ways.  One  specific 
form  of  the  basis  functions  is  given  in  simplex  coordinates  in  Table  3.  Table  4  shows  numerical 
eigenvalues  produced  by  the  LT/QN  functions  when  used  to  discretize  (1)  for  a  circular, 
homogeneous  cavity  of  unit  radius.  As  compared  to  the  complete  quadratic  case,  the  results 
contain  the  same  number  of  nonzero  eigenvalues  but  far  fewer  zero  eigenvalues;  the  number  of 
zero  eigenvalues  is  reduced  by  exactly  the  number  of  basis  functions  excluded. 
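
The same kind of check can be made for the quadratic split. The sketch below is an illustration under stated assumptions: polynomials are stored as dictionaries keyed by the exponents of x and y, the helper names and sample coefficients are introduced here, and the leading x-component term of (8) is taken as D x^2, which is what the sum of (7) and (8) needs in order to reproduce (4).

```python
from fractions import Fraction

def curl_z(bx, by):
    """Return d(by)/dx - d(bx)/dy for polynomials stored as {(i, j): c},
    where (i, j) stands for the monomial x**i * y**j."""
    out = {}
    for (i, j), c in by.items():
        if i > 0:
            out[(i - 1, j)] = out.get((i - 1, j), 0) + i * c
    for (i, j), c in bx.items():
        if j > 0:
            out[(i, j - 1)] = out.get((i, j - 1), 0) - j * c
    return {k: v for k, v in out.items() if v != 0}

def add(p, q):
    """Sum of two polynomials, dropping zero coefficients."""
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

# Arbitrary nonzero sample values for the twelve coefficients A..L of (4).
A, B, C, D, E, F, G, H, I, J, K, L = (Fraction(n) for n in range(1, 13))
third = Fraction(1, 3)

# Mixed-order part, eq. (7).
mx = {(0, 0): A, (1, 0): B, (0, 1): C,
      (1, 1): third * (E - 2 * J), (0, 2): third * (2 * F - K)}
my = {(0, 0): G, (1, 0): H, (0, 1): I,
      (2, 0): third * (2 * J - E), (1, 1): third * (K - 2 * F)}

# Gradient part, eq. (8), with the leading x-component term taken as D x^2.
gx = {(2, 0): D, (1, 1): 2 * third * (E + J), (0, 2): third * (F + K)}
gy = {(2, 0): third * (E + J), (1, 1): 2 * third * (K + F), (0, 2): L}

# Complete quadratic expansion, eq. (4).
cx = {(0, 0): A, (1, 0): B, (0, 1): C, (2, 0): D, (1, 1): E, (0, 2): F}
cy = {(0, 0): G, (1, 0): H, (0, 1): I, (2, 0): J, (1, 1): K, (0, 2): L}

print("curl of (8):", curl_z(gx, gy))
print("(7) + (8) == (4):", add(mx, gx) == cx and add(my, gy) == cy)
print("curl of (7) == curl of (4):", curl_z(mx, my) == curl_z(cx, cy))
```

An empty dictionary for the curl of (8) means every coefficient cancels, so discarding (8) leaves the curl of the expansion unchanged.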

It  is  noteworthy  that  the  six  linear  basis  functions  in  Table  1  form  a  subset  of  the  LT/QN 
functions  defined  in  Table  3.  By  comparing  Tables  2  and  4  we  observe  that  the  LT/QN  expansion 
produces  exactly  the  same  number  of  zero  eigenvalues  as  the  complete  linear  expansion.  This 
suggests  that  the  additional  basis  functions  used  to  build  up  the  set  of  8  LT/QN  functions  do  not 
contribute  to  eigenfunctions  in  the  nullspace.  However,  their  addition  to  the  set  of  six  LT/LN 
functions  in  Table  1  clearly  improves  the  accuracy  of  the  nonzero  eigenvalues. 
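
The bookkeeping behind Tables 2 and 4 can be reproduced in a short sketch. The edge count of 54 is an assumption inferred from the CT/LN matrix order (one unknown per edge), and the per-edge and per-cell counts are read from the text; the dictionary and function names are illustrative only.

```python
CELLS = 42
EDGES = 54  # assumption: unknown-carrying edges, inferred from the CT/LN order

# (unknowns per edge, unknowns per cell) for each family discussed in the text
FAMILIES = {
    "CT/LN": (1, 0),
    "LT/LN": (2, 0),
    "LT/QN": (2, 2),
    "QT/QN": (3, 3),
}

# Zero-eigenvalue counts reported in Tables 2 and 4
ZEROS = {"CT/LN": 13, "LT/LN": 67, "LT/QN": 67, "QT/QN": 163}

def matrix_order(family):
    """Matrix order = edge-based unknowns + cell-based unknowns."""
    per_edge, per_cell = FAMILIES[family]
    return per_edge * EDGES + per_cell * CELLS

orders = {name: matrix_order(name) for name in FAMILIES}
nonzero = {name: orders[name] - ZEROS[name] for name in FAMILIES}

for name in FAMILIES:
    print(f"{name}: order {orders[name]}, zero {ZEROS[name]}, nonzero {nonzero[name]}")
```

With these counts the matrix orders match the table captions, and the drop in zero eigenvalues from each complete set to its mixed-order counterpart equals the drop in matrix order, so the number of nonzero eigenvalues is preserved.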

A  general  cubic  polynomial  representation  of  a  vector  function  in  2D  contains  20  degrees  of 
freedom.  There  are  five  degrees  of  freedom  associated  with  the  gradient  of  a  fourth-order 
polynomial,  which  can  be  excluded  to  reduce  the  Cartesian  form  of  the  basis  functions  to 

$\mathbf{B}(x,y) = \hat{x}\,\{A + Bx + Cy + Dx^2 + Exy + Fy^2 + Gx^2y + Hxy^2 + 3Iy^3\}$

$\qquad + \hat{y}\,\{J + Kx + Ly + Mx^2 + Nxy + Oy^2 - 3Gx^3 - Hx^2y - Ixy^2\}$  (9)

Within  a  cell,  one  possible  form  of  the  15  basis  functions  is  given  in  simplex  coordinates  in  Table 
3.  Nine  of  these  functions  interpolate  to  the  tangential  vector  component  along  cell  edges,  while 
six  functions  build  up  the  normal  component.  Together,  these  15  basis  functions  provide  a 
representation  with  quadratic  tangential  and  cubic  normal  components  (QT/CuN). 
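
The per-cell counts quoted for the linear, quadratic and cubic cases follow one pattern, sketched below. The closed-form expressions are a summary introduced here (they are not stated in this form in the text), and the function names are hypothetical; they simply reproduce the counts 3, 8 and 15.

```python
def complete_dof(n):
    """Coefficients in a complete order-n vector polynomial in 2D:
    two components, each with (n+1)(n+2)/2 monomials."""
    return (n + 1) * (n + 2)

def gradient_dof(n):
    """Number of monomials x**i * y**(n+1-i) of exact degree n+1; their
    gradients are the curl-free order-n fields that are dropped."""
    return n + 2

def mixed_order_dof(n):
    """Degrees of freedom left in the mixed-order representation."""
    return complete_dof(n) - gradient_dof(n)

for n, label in ((1, "CT/LN"), (2, "LT/QN"), (3, "QT/CuN")):
    print(label, complete_dof(n), gradient_dof(n), mixed_order_dof(n))
```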

The CT/LN, LT/QN and QT/CuN basis functions in Table 3 are consistent with Nedelec’s
spaces [2]. Cendes [5] proposed functions with the same number of degrees of freedom per cell as
the Nedelec functions, but with a different mathematical form than the LT/QN and QT/CuN
functions in Table 3. Webb and Forghani have proposed hierarchical vector basis functions for
tetrahedral cells [6]. Their functions appear to incorporate those of Cendes [5], and therefore are
not consistent with the Nedelec spaces in [2].


5.  Mixed-order  expansions  for  quadrilateral  cells 


Nedelec  also  originally  proposed  mixed-order  basis  functions  for  quadrilateral  and 
hexahedral  cells  [2].  These  functions  have  a  different  number  of  degrees  of  freedom  than  those 
used  on  triangles  and  tetrahedral  cells,  and  a  different  mathematical  expression  in  Cartesian 
coordinates  than  the  mixed-order  functions  developed  in  the  preceding  section.  The  relationship 
between  these  basis  functions  can  be  seen  as  follows. 

The  mixed-order  functions  for  triangular  cells  are  somewhat  optimal  in  that  they  discard  all 
the  obvious  degrees  of  freedom  associated  with  the  nullspace.  There  is  no  reason  why  some  of 
these degrees of freedom cannot be kept in the representation, however. For instance, as an
alternative  to  the  subspaces  defined  by  (7)  and  (8),  we  could  employ  the  projection 

$\mathbf{B}(x,y) = \hat{x}\,\{A + Bx + Cy + Dxy + Ey^2\} + \hat{y}\,\{F + Gx + Hy + Ix^2 + Jxy\}$  (10)

omitting  the  subspace 

$\mathbf{B}_{grad}(x,y) = \hat{x}\,Kx^2 + \hat{y}\,Ly^2$  (11)

which  has  zero  curl.  This  way  of  separating  the  subspaces  eliminates  two  degrees  of  freedom, 
leaving 10. However, 12 degrees of freedom are required to build up a linear tangential and
quadratic  normal  (LT/QN)  component  along  the  sides  of  a  rectangular  cell.  The  polynomial 
components  in  (11)  do  not  contribute  to  an  LT/QN  expansion,  and  instead  it  is  convenient  to  add 
two  cubic-order  degrees 

$\hat{x}\,Kxy^2 + \hat{y}\,Lx^2y$  (12)

The  resulting  expansion  provides  an  LT/QN  representation  along  the  cell  edges.  (The  expansion 
for  quadrilaterals  differs  from  that  used  with  triangles  in  that  it  is  not  purely  LT/QN  along  any  cut 
within  a  cell.)  Crowley  confirmed  that  these  functions  properly  represent  the  nullspace  and 
therefore  eliminate  spurious  modes  [4],  when  only  tangential  continuity  is  imposed. 

6.  Summary 

Spurious  eigenvalues  arising  with  discretizations  of  the  curl-curl  form  of  the  vector 
Helmholtz  equation  appear  to  be  caused  by  the  inability  of  a  continuous  basis  expansion  to 
properly  model  the  discontinuous  nullspace  eigenfunctions,  and  can  be  alleviated  by  using  an 
expansion  that  only  imposes  tangential  continuity.  Thus,  a  wide  variety  of  basis  functions  can  be 
used  successfully,  as  long  as  they  do  not  impose  normal  continuity.  Mixed-order  basis  functions 
of  the  Nedelec  variety  [2]  help  to  reduce  the  computational  requirements  of  a  vector  finite  element 
implementation  by  eliminating  some  of  the  degrees  of  freedom  associated  with  the  nullspace. 
Linear,  quadratic,  and  cubic  order  basis  functions  of  this  type  have  been  discussed. 
Introduction: Periodic arrays find applications as frequency selective surfaces, spatial filters, array
antennas and in corrugated horns. Here we illustrate how to compute the electromagnetic
characteristics of such structures using widely accessible finite element models and extended
waveguide simulator concepts. We apply the method to three distinct examples and verify the
solutions by other means. The results indicate the accuracy and many useful applications of the
method. The first example is the calculation of the transmission coefficient of a dichroic surface. In
the second we determine the surface impedance of a corrugated surface. Finally, we consider the
return loss of waveguide type elements in large arrays. The generality of the technique and the
accessibility of commercial finite element method (FEM) codes make the procedure extremely useful
for verifying other numerical models [1].


Waveguide simulators are used extensively for the characterisation of elements in large planar phased
array antennas [2]. In a properly designed simulator the element has the reception characteristics of an
element in an infinite array environment. Hence, a physically bound experimental model simulates an
unbound scattering problem. The method has limited usefulness for a number of reasons. First, the
periodic element needs a symmetry plane since waveguide modes are equivalent to pairs of symmetric
plane waves and each plane wave in the pair must see the identical element with the same orientation.
Additionally, the incidence angle of the simulated plane wave is a function of frequency. Further, the
dominant mode of the waveguide (TE or TM) dictates the polarisation. Therefore, each frequency and
incidence angle under consideration requires the construction of a separate waveguide simulator.
Measurements using the simulators require construction of adapters to standard size transmission
line.

With the advent of sophisticated finite element models, some of the shortcomings of the experimental
model can be overcome. FEM was first applied to waveguide problems in the late sixties [3,4] but
because the method is so numerically intensive it was not very useful until the development of
adequate computers. Now a variety of user friendly FEM analysis packages are commercially
available [5,5a]. Finite element techniques are particularly applicable to physically bound problems.
We take advantage of this by using a bound simulator model for the unbound array. Rather than
building a new simulator for every frequency or incident angle, we model the simulator using the
finite element technique. The computational model overcomes many limitations of the experimental
model. Lossless walls are introduced to consider frequencies near the waveguide cut-off frequency.
Additionally, the use of magnetic boundaries permits new waveguide modes to propagate. These
utilities allow us to consider orthogonal polarisations and hence circular polarisation. A number of
symmetries reduce the computation time significantly.

Generalised Simulator Theory: The described procedure is applicable to a wide variety of periodic
geometries. The guidelines of conventional waveguide simulation are well documented [2,6]. One can
conceive a variety of simulator geometries, each corresponding to different planes of incidence or
polarisation. Let us first consider only rectangular waveguide and the dual simulator
types. All TE modes in rectangular waveguide consist of H-polarised waves, and all TM modes in the
waveguide consist of E-polarised waves. Hence, each waveguide mode defines a specific scattering
situation (i.e. incidence angle, incidence plane, E or H polarisation). Figure 1 depicts a simulator
containing three elements of a periodic structure. We use this to calculate the electrical characteristics
of an infinite periodic structure when excited by an E-polarised plane wave.


Figure 1 - Magnetic Walled Waveguide Simulator (walls labelled "Symmetry Walls")

In a two dimensional periodic array there are several directions in which planes of symmetry reside.
Additionally, the simulator can hold a number of the structure unit cells. Hence, the dimensions of the
simulator are chosen by

$a = p\,\tau_x, \quad b = q\,\tau_y$  (1)

where $\tau_x$ and $\tau_y$ are the periodicities in each of their respective directions and p and q are integers
corresponding to the number of unit cells in the simulator.


In general, the direction of plane wave propagation may be represented by the propagation vector
$\mathbf{k} = k_x \hat{x} + k_y \hat{y} + k_z \hat{z}$. In waveguide, k is established from the waveguide mode indices. The
waveguide propagation constants in the various directions are given by

$k_x = \frac{m\pi}{a}, \quad k_y = \frac{n\pi}{b}, \quad k_z^2 = k_0^2 - k_x^2 - k_y^2, \qquad m, n = \ldots, -2, -1, 0, 1, 2, \ldots$  (2)

where m and n are integers corresponding to the waveguide mode and $k_0 = 2\pi/\lambda$. Clearly, the values
of m and n are limited if $k_z$ is to have a real value. The incidence angle for higher order mode
propagation is given by

$\sin\theta_i = \frac{k_\rho}{k_0}, \quad \tan\phi_i = \frac{k_y}{k_x}, \quad \text{where } k_\rho^2 = k_x^2 + k_y^2$  (3)

An interesting consequence of these formulas is that a waveguide array element with thin walls and
dimensions a, b is perfectly matched at the scan angle dictated by the above equation. This is
obvious since the waveguide element and the waveguide simulator have the same dimensions.
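
The dependence of the simulated incidence angle on frequency can be illustrated numerically. The sketch below assumes the standard rectangular-waveguide relations $k_x = m\pi/a$, $k_y = n\pi/b$, $\sin\theta = k_\rho/k_0$ and $\tan\phi = k_y/k_x$; the function name and the guide dimensions in the example are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def simulated_plane_wave(m, n, a, b, freq_hz):
    """Return (theta_deg, phi_deg) of the plane wave simulated by mode (m, n)
    in an a-by-b guide, or None if the mode is at or below cut-off."""
    k0 = 2.0 * math.pi * freq_hz / C0
    kx = m * math.pi / a
    ky = n * math.pi / b
    k_rho = math.hypot(kx, ky)
    if k_rho >= k0:
        return None  # k_z would not be real and positive
    theta = math.degrees(math.asin(k_rho / k0))
    phi = math.degrees(math.atan2(ky, kx))
    return theta, phi

# Hypothetical example: 22.86 mm x 10.16 mm guide, (1, 0) mode.
for f_ghz in (8.0, 10.0, 12.0):
    print(f_ghz, simulated_plane_wave(1, 0, 22.86e-3, 10.16e-3, f_ghz * 1e9))
```

For a fixed mode the angle moves toward normal incidence as the frequency rises, which is why a physical simulator covers only one angle per frequency.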


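We now state the grating lobe condition in terms of propagation vectors. Assuming a rectangular
lattice, the scattered beam positions are given by

$k_x^s(l_x) = k_x^i - \frac{2\pi}{\tau_x}\,l_x, \quad k_y^s(l_y) = k_y^i - \frac{2\pi}{\tau_y}\,l_y, \qquad l_x, l_y = \ldots, -2, -1, 0, 1, 2, \ldots$  (4)

where $l_x$ and $l_y$ are the indices of the grating lobe and s and i denote the scattered and
incident fields respectively. Combining this with relations (1) and (2),

$k_x^s(l_x) = \frac{\pi}{a}\left[m - 2 l_x p\right], \quad k_y^s(l_y) = \frac{\pi}{b}\left[n - 2 l_y q\right]$  (5)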


Figure 2 depicts equation (5) in k-space, showing only the positive m and n values. A grating lobe
occurs at each $|k^s(l)| < k_0$. The bracketed term in (5) is an integer and for each positive value of this
integer there is a corresponding negative value. Hence, for every grating lobe there are three
symmetric grating lobes. This applies to the evanescent fields as well ($|k^s(l)| > k_0$). This conforms with
our supposition that waveguide modes are equivalent to combinations of symmetric plane waves. The
$(m - 2 l_x p,\; n - 2 l_y q)$ waveguide simulator modes correspond to grating lobes. It may turn out that the
newly excited waveguide mode is the degenerate mode. In this case the grating lobe is the
mirror image of the transmitted (incident) field and no new mode is excited in the waveguide
simulator. Hence, one may have a grating lobe without exciting a new waveguide mode but not the
converse. There is always a grating lobe when a new waveguide mode is excited.
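
The correspondence between grating lobes and simulator modes can be tabulated with a short sketch. It assumes the mode mapping $(m - 2 l_x p,\; n - 2 l_y q)$ described above and treats a lobe as propagating when its transverse wavenumber is below $k_0$; the function name, the index range and the example dimensions are hypothetical.

```python
import math

def grating_lobe_modes(m, n, p, q, a, b, wavelength, max_index=3):
    """List (l_x, l_y, m_s, n_s, propagating) for grating-lobe indices
    |l_x|, |l_y| <= max_index, with m_s = m - 2*l_x*p and n_s = n - 2*l_y*q."""
    k0 = 2.0 * math.pi / wavelength
    rows = []
    for lx in range(-max_index, max_index + 1):
        for ly in range(-max_index, max_index + 1):
            ms = m - 2 * lx * p
            ns = n - 2 * ly * q
            k_t = math.hypot(ms * math.pi / a, ns * math.pi / b)
            rows.append((lx, ly, ms, ns, k_t < k0))
    return rows

# Hypothetical simulator: one unit cell across (p = q = 1), a = b = 1.2 wavelengths.
lobes = grating_lobe_modes(m=1, n=0, p=1, q=1, a=1.2, b=1.2, wavelength=1.0)
for row in lobes:
    if row[4]:
        print(row)
```

In this example the lobe with $l_x = 1$, $l_y = 0$ maps onto mode (-1, 0), the mirror image of the incident (1, 0) mode, so it is a grating lobe that excites no new waveguide mode.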


FEM Simulator Models: We now turn our attention to the capabilities of the finite element
numerical simulator model relative to conventional waveguide simulators. The simulator shown in
figure 1 is not physically realisable because of its "magnetic walls". The magnetic walls are
composed of perfect magnetic conductors (PMC) and have tangential H-fields equal to zero. Although
such materials are not known to exist in nature, we can construct a FEM numerical model of the
simulator and hence determine the E-polarised scattering characteristics of the periodic structure. For
an H-polarised plane wave we apply duality and exchange the electric walls with magnetic walls.


The simulator geometries used in experimental models are not necessarily the best for numerical
models. If the simulator is large enough, higher order modes also propagate. Measurement using these
higher order modes poses many problems. That is not the case using FEM computations. In total,
there are three useful simulator geometries, as shown in figure 3.


Figure 3 - Simulator Geometries (walls labelled "Magnetic Walls")


The leftmost simulator geometry is the conventional experimental type and the middle is as discussed
above. The simulator geometry on the right uses both electric and magnetic walls. The fundamental
TEM mode propagating in this "waveguide" is a uniform plane wave over the bounds of the guide.
Image theory dictates that the unit cells in this guide see what is equivalent to normal incidence on the
infinite periodic structure. Normal incidence cannot be considered using conventional waveguide
simulators. We obtain horizontal polarisation by simply exchanging the magnetic and electric walls.
In all cases, we obtain circular (elliptic) polarised reflection by taking the complex sum of the E and H
polarised reflection coefficients with the appropriate phase delay (and amplitude weighting).

From (5) we recognise a restriction on the observation angles. We can compute only a discrete set of
angles $\theta_i$ and $\phi_i$ corresponding to the integers m, n, p and q. The range of observation points increases with
larger p and q, as does the computer processing time. After a single computation using the finite element
technique, the resulting matrix allows us to determine the coupling between all the propagating
modes. Hence, if the simulator supports N×M modes, then we know the reflection coefficient for N×M
incident angles. Normally we are interested in results over a frequency band. In such cases we set p
and q to some nominal value and compute results for m = 1, 2, ..., M and n = 1, 2, ..., N and a set of
R frequencies. This gives us a total of M×N×R points over which we interpolate.

The required computational time is proportional to the square of the volume of the simulator. We
reduce this time by 75% by placing a perfectly conducting electric wall along the centreline of the
structure. This is permitted since the symmetry dictates that the electric field is zero along this plane.
For doubly periodic structures a similar symmetry plane allows an additional reduction in the volume,
making the computation time about 1/16 of the original for the same numerical accuracy. Instead of
the FEM, the simulator geometry can be considered using FDTD. In this case we are able to process
multiple frequencies in a single run but only with a single mode. Therefore we trade multiple incident
angles for multiple frequencies.
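
The quoted savings follow directly from the stated scaling of time with the square of the volume. A minimal sketch of that arithmetic, with a hypothetical function name and the scaling law taken as an assumption from the text:

```python
def relative_time(volume_fraction):
    """Run time relative to the full model, assuming time ~ volume**2."""
    return volume_fraction ** 2

one_plane = relative_time(0.5)    # one symmetry plane halves the volume
two_planes = relative_time(0.25)  # a second plane halves it again

print(f"one plane:  {one_plane:.4f} of original, saving {100 * (1 - one_plane):.0f}%")
print(f"two planes: {two_planes:.4f} of original, i.e. 1/{round(1 / two_planes)}")
```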

Reviewing the general formulation, the angle of incidence is equivalent to knowing the progressive
phase change of the currents on each array element and is dictated by the waveguide simulator mode
and the unit cell dimensions (m, n, p, q, $\tau_x$, $\tau_y$). Even if we excite the structure with a single
waveguide mode, other modes may be reflected. These other modes correspond to symmetric pairs of
grating lobes. The relative amplitudes of these grating lobes are the same as the relative power of the
newly excited modes. Here we consider only rectangular lattices but the observations are equally valid
for triangular lattices.

Case Studies: The usefulness of the method is best illustrated by a few examples. For each of the
cases we develop or duplicate an alternate analysis method. The results of each example are
corroborated by the alternate method and the accuracy is examined. The examples also give some
insight into the functioning of the periodic structures.

a.  Reflection  From  a  Dichroic  Surface 

In this section, we calculate the reflection from the crossed-dipole surface shown in figure 4 using
the FEM simulator model and a spectral moment method. This periodic geometry is considered in a
number of papers [7,8]. We investigate normal incidence and hence use a simulator with two electric
walls and two magnetic walls. Figure 5 shows the cross section of the simulator unit cell with the
magnitude of the E-field as calculated by FEM. The left and right walls are PMC while the top and
bottom are PEC. As expected, the E-field over the conductor is small since the tangential component
of the E-field is zero.

For comparison purposes we use the full-wave moment method analysis for an illuminating field that
is applicable for any arbitrary incident angle. The method is described in detail by Tsao and Mittra [7]
and outlined as follows. An integral equation is developed for the fields due to a set of currents on the
free-standing crosses. We assume the current on each cross to be identical except for a progressive
phase term. This assumption corresponds to plane wave incidence. The problem is formulated in the
spectral domain. This reduces the convolution form of the integral equation for the induced current
into an algebraic one. We use Galerkin’s procedure in order to solve the spectral domain equation. To
account for the discontinuous nature of the induced current at the junction of the cross we use a set of
entire domain "junction basis functions".


Figure 4 - Dichroic surface geometry printed on 0.127mm kapton ($\epsilon_r$ = 4.25)

Figure 5 - Simulator unit cell and the E-field magnitude


Figure 6 shows the computed power transmission coefficients for the free standing cross frequency
selective surface. The left curve is for free standing crosses and the right is for crosses on a sheet of
kapton. As one would expect, the dichroic is a good microwave reflector when the elements are
roughly one half wavelength in the E-plane.

Figure 6 - Dichroic Reflection Coefficient


The FEM results match the measured results except for a 0.8GHz shift in frequency. The FEM
simulator results compare adequately with those of the spectral Galerkin moment method. The
moment method routine considers the crosses to have squared ends rather than rounded. This may
account for the disparity. Validation of this case is particularly useful since the documented methods
do show some discrepancies[7,8].

b. Surface Impedance: We now direct our attention to the computation of the surface impedance of an
infinite impenetrable planar surface. Knowledge of the surface impedance allows boundary value
problems to be solved[9]. A surface can be characterised by a surface impedance only if it has a
periodicity smaller than a wavelength. The surface impedance of a periodic structure is a function of
the plane of incidence. A distinction must be made between the macroscopic and microscopic
viewpoints. From a microscopic view the surface has drastically varying field magnitudes and hence a
nonuniform surface impedance. However, the aggregate effect of these fields may be represented in
the large by a parameter such as Z_s. In general, this surface impedance will be a function of the
orientation and incidence angle. The surface impedance seen by the incident wave is found from

Z_s = C \int_0^{ma} \int_0^{nb} \frac{E_{\tan}}{H_{\tan}} \, dx \, dy    (6)

where a and b describe a unit cell, m and n are integers and C is a normalisation constant. Evanescent
fields exist near the surface but integrate to zero over the unit cell. Hence, we obtain the same surface
impedance magnitude regardless of the calculation plane (z = constant).

For demonstration purposes we consider a perfectly conducting corrugated surface with a dielectric in
the corrugations. Corrugated surfaces are widely used in the design of electromagnetic structures.
Consider an obliquely incident TM (E-polarised) plane wave as shown in figure 7. We determine the
surface admittance over a range of incidence angles and frequencies for a plane wave with a path of
travel transverse to the corrugations.

We choose the corrugation depth in such a way as to have the transverse magnetic fields cancel at the
top of the corrugations. This results in a large reactance (small susceptance), as shown in figure 8,
where η_0 is the free space impedance.
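The link between corrugation depth and a large reactance can be illustrated with an idealised model. The sketch below treats a single corrugation as a short-circuited, dielectric-filled parallel-plate stub, so that the normalised input reactance is tan(k_d d)/sqrt(ε_r). This stub model, the function names and the example permittivity are assumptions made for illustration; it ignores wall thickness, fringing and oblique incidence, and it is not the FEM simulator or the mode matching solution used in the paper.

```python
import math


def quarter_wave_depth(wavelength_0, eps_r):
    """Depth at which an ideal dielectric-filled slot is a quarter wave deep."""
    return wavelength_0 / (4.0 * math.sqrt(eps_r))


def stub_reactance_normalised(depth, wavelength_0, eps_r):
    """Reactance X/eta_0 of an ideal short-circuited slot (assumed model).

    Z_in = j * (eta_0 / sqrt(eps_r)) * tan(k_d * depth), with
    k_d = 2*pi*sqrt(eps_r) / wavelength_0 the wavenumber in the dielectric.
    """
    k_d = 2.0 * math.pi * math.sqrt(eps_r) / wavelength_0
    return math.tan(k_d * depth) / math.sqrt(eps_r)


if __name__ == "__main__":
    lam, eps = 1.0, 4.0  # illustrative values, not taken from the paper
    d = quarter_wave_depth(lam, eps)
    for fraction in (0.5, 0.9, 0.99):
        x = stub_reactance_normalised(fraction * d, lam, eps)
        print(f"depth = {fraction:.2f} * d_quarter -> X/eta_0 = {x:.2f}")
```

In this simplified picture the reactance grows without bound as the depth approaches a quarter of the wavelength in the dielectric, which is the qualitative behaviour the depth selection relies on.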


Figure 7 - Transverse Propagating TM Incidence (dimension labelled in the figure: G = 0.09λ_0)

Extensive documentation of the infinite corrugated surface and the infinite periodic array is
available[10]. Previous studies use, among other techniques, mode matching, moment method and
direct analytical solutions. Here we use the mode matching technique as described by Wang[11, pp
365-376] to verify the FEM simulator results for a dielectric filled corrugated surface. A brief
description of the procedure is as follows. We first divide the space into two regions. Integral
equations are formulated by enforcing boundary conditions at the interface. We use parallel plate
cavity modes to represent the field inside the corrugation. These modes fulfil the boundary conditions
of the cavity formed by the corrugation. We equate the tangential electric and magnetic fields of the
two regions in the aperture. This gives the unknown coefficients of the free space Floquet modes.
Assuming the corrugations are spaced close together, there is a single specular reflection. Its
amplitude corresponds to that of the zeroth order Floquet mode. The coefficient of this propagating
mode gives the surface impedance of the structure.

The FEM simulator and the mode matching results are shown in figure 8. Also shown is the
difference between the results. The FEM simulator approach yields results within about 2% of the
mode matching solution.

Figure 8 - Corrugated Structure Surface Admittance


c. Return Loss of Waveguide Elements with Dielectric Loaded Walls

Waveguide simulators are most often used for the design of phased array elements. We calculate the
reflection of the array shown in figure 9. It consists of perfectly conducting parallel plate waveguides
with homogeneous dielectric slabs on the side walls. If the dielectric slabs are chosen using,

slab  thickness  =  ,  ,  ~ 


the geometry provides a relatively uniform aperture distribution. Therefore it gives a higher directivity
than conventional waveguide elements with the same aperture dimensions. Results for a thick-walled
waveguide element with dielectric loaded walls are not available in open literature. Mailloux and
Steyskal[12] reported a similar technique for a less general case.


Figure 9 - Inhomogeneously Loaded Waveguide Array

We again determine the return loss of the array elements by applying the mode matching technique. A
superposition of parallel plate modes represents the fields in the inhomogeneous dielectric filled
waveguide. The field radiated into free space is a sum of Floquet modes with unknown amplitudes.
Matching the tangential components of the fields at the aperture reduces the problem to a system of
linear algebraic equations for the amplitudes of each waveguide mode and Floquet harmonic. We
solve the system by Gauss-Jordan elimination and extract the reflection coefficient. The technique
includes all mutual coupling effects.

a. Loaded Waveguide Element Reflection Coefficient

b. Difference between FEM and moment method results

Figure 10 - Inhomogeneously Loaded Waveguide Results


The reflection coefficient of the defined array is shown in figure 10 along with the difference between
the approaches. The FEM simulator and the mode matching reflection coefficient differ by less than
2%. At higher frequencies and large incident angles we notice a ridge corresponding to a grating lobe.
The techniques are both valid for this case as well as when only a single plane wave is excited.

The  effective  element  pattern  radiated  from  the  infinite  array  is  given  by 

f(\theta) = \left(1 - |\Gamma(\theta)|^2\right) \cos\theta

where Γ is the infinite array reflection coefficient when no grating lobe radiates[12]. Figure 11 shows
the effective element pattern computed using the waveguide simulator technique (shown with points)
and the mode matching method described above (shown with lines). The dimensions of the array are
in figure 9, where the wavelength corresponds to f_0.

Figure 11 - Effective Element Pattern of Dielectric Slab Array
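As a small worked illustration of the element pattern expression, the sketch below evaluates (1 - |Γ|²) cos θ for a given complex reflection coefficient, together with the corresponding return loss in dB. The function names and sample values are assumptions for illustration; in practice Γ(θ) would come from the simulator or mode matching results.

```python
import math


def effective_element_pattern(gamma, theta_rad):
    """Power pattern f(theta) = (1 - |Gamma(theta)|^2) * cos(theta)."""
    return (1.0 - abs(gamma) ** 2) * math.cos(theta_rad)


def return_loss_db(gamma):
    """Return loss in dB, -20*log10|Gamma| (positive when |Gamma| < 1)."""
    return -20.0 * math.log10(abs(gamma))


if __name__ == "__main__":
    gamma = 0.3 - 0.4j  # illustrative value with |Gamma| = 0.5
    theta = math.radians(60.0)
    print(effective_element_pattern(gamma, theta))
    print(return_loss_db(gamma))
```

The expression only applies while no grating lobe radiates, as stated in the text.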


Conclusions and Future Work: We combine the finite element modelling technique with the waveguide
simulator concept to determine the electrical properties of periodic structures and then apply the
method to a number of interesting antenna and scattering geometries. Comparisons with results
obtained by other methods validate the analysis procedure. The method proves useful for predicting
the electromagnetic properties of complex periodic structures.


The method is not numerically efficient since the entire waveguide simulator is modelled using the
finite element technique. The alternative methods outlined in the paper were normally a few orders of
magnitude faster than the FEM simulator technique. Still, it is very useful for validating other analysis
techniques or for testing design ideas, and no time is spent on computer programming. Further, it is a
tool presently accessible to engineers in most high tech firms. We presently use this FEM simulator
technique to verify all our numerical models for periodic structures and normally obtain accuracy far
beyond that achievable by measurement.

Abstract - A mapped infinite element technique is described for solving unbounded electromagnetic field problems.
The method is based on the mapping of a semi-infinite strip onto a finite local element using singular mapping
functions. The infinite element is used like a boundary condition and therefore does not increase the final matrix
dimension, which keeps calculation time and computation costs low. Tests have been done on 2-dimensional
electromagnetic problems and the results are good. Virtual elements are described and used to
explore the entire domain. Results are presented based on the solution of two different electromagnetic structures.

INTRODUCTION 

The finite element method (FEM) is a general technique for solving boundary value problems [1,4,6,13]. Several
electromagnetic devices are studied in free space, or in unbounded domains. The most used technique is to place the
boundary 'far enough' from the studied device, but this raises computation costs [2-4,13]. Several techniques for modelling
unbounded problems have been studied [3,9,10,14]. These methods are classified into global methods, in which the exterior
domain is considered as one, such as truncation and ballooning [2,3], and elementary methods, in which the exterior
domain is divided into a finite number of elements, or infinite elements [8,11].

In this paper, the mapped infinite element technique is described and some results for two dimensional
electromagnetic problems are shown. For the visualisation of the exterior field the virtual element technique is described
[8].


MATHEMATICAL FORMULATION AND APPLICATION OF FINITE ELEMENT METHOD


The governing equations of electromagnetic problems are Maxwell's equations [1,4]

\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0    (1)

\nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}    (2)

\nabla \cdot \mathbf{B} = 0    (3)

\nabla \cdot \mathbf{D} = \rho    (4)

with the constitutive relations of the form

\mathbf{D} = \varepsilon \mathbf{E}    (5)

\mathbf{B} = \mu \mathbf{H} + \mathbf{B}_r    (6)

\mathbf{J} = \sigma \mathbf{E}    (7)

Coordinate mapping between local and global systems is commonly used in FEM. In the mapped infinite element
technique a singular mapping function M_i, based on 1/(1-ξ), is used. With these functions some nodes of the local system
are mapped into nodes at infinity in the global system: as ξ → 1, M_i → ∞.

A one dimensional element in global coordinates with node 3 at infinity, mapped onto a parent element in the
local system -1 ≤ ξ ≤ +1, is considered in figure 1.


Figure 1. One dimensional infinite element mapping

In a one dimensional element the mapping (from global x to local ξ coordinate) can be defined as


Figure 2. Singular mapping of a global element with nodes placed at infinity in local reference element.
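To make the idea of a singular mapping based on 1/(1-ξ) concrete, the sketch below uses a commonly quoted two-node form, x(ξ) = M_1(ξ) x_1 + M_2(ξ) x_2 with M_1 = -2ξ/(1-ξ) and M_2 = (1+ξ)/(1-ξ). This particular pair of mapping functions is an assumption chosen for illustration and is not necessarily the form used by the authors; the function names are hypothetical.

```python
def mapping_functions(xi):
    """Singular 1-D mapping functions on the parent interval -1 <= xi < 1.

    Assumed form: M1 = -2*xi/(1 - xi), M2 = (1 + xi)/(1 - xi).
    They sum to one, and both grow without bound as xi -> 1.
    """
    if not -1.0 <= xi < 1.0:
        raise ValueError("xi must satisfy -1 <= xi < 1")
    m1 = -2.0 * xi / (1.0 - xi)
    m2 = (1.0 + xi) / (1.0 - xi)
    return m1, m2


def local_to_global(xi, x1, x2):
    """Map local xi to global x; xi = -1 gives x1, xi = 0 gives x2."""
    m1, m2 = mapping_functions(xi)
    return m1 * x1 + m2 * x2


def global_to_local(x, x1, x2):
    """Inverse of local_to_global for x >= x1 (x -> infinity gives xi -> 1)."""
    return (x - x2) / (x + x2 - 2.0 * x1)


if __name__ == "__main__":
    for xi in (-1.0, 0.0, 0.5, 0.9, 0.99):
        print(xi, local_to_global(xi, 1.0, 2.0))
```

With x_1 = 1 and x_2 = 2 the local points ξ = -1, 0, 0.5 map to x = 1, 2, 4, and x grows without bound as ξ approaches 1, so the third node of the parent element corresponds to a point at infinity.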

The finite element method is coupled with the infinite element method by adding a boundary
condition that defines an infinite boundary (figure 3).


Figure  3.  Infinite  elements  coupled  with  finite  element  mesh 


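The final element matrix is defined as:

K_{ji} = \int_{-1}^{1} \int_{-1}^{1} \left( \frac{\partial N_j}{\partial x} \frac{\partial N_i}{\partial x} + \frac{\partial N_j}{\partial y} \frac{\partial N_i}{\partial y} \right) |J| \, d\xi \, d\eta    (19)

where |J| is the determinant of the Jacobian, defined as:

|J| = \det[J] = \frac{\partial x}{\partial \xi} \frac{\partial y}{\partial \eta} - \frac{\partial x}{\partial \eta} \frac{\partial y}{\partial \xi}    (20)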

The field variables at nodes 5 and 6 (figure 2) are specified as zero as the boundary condition of the infinite element
and do not appear in matrix K_ji (equation 19) [6,9].
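The determinant of the Jacobian that enters the element matrix can be illustrated on an ordinary four-node bilinear element. The sketch below is an assumption-laden example: it uses the standard bilinear shape functions on a finite quadrilateral, not the singular mapping functions of the infinite element, and the function name is hypothetical.

```python
def jacobian_determinant(nodes, xi, eta):
    """|J| = dx/dxi * dy/deta - dx/deta * dy/dxi for a bilinear quadrilateral.

    `nodes` holds four (x, y) corner coordinates in counter-clockwise order,
    matching local corners (-1,-1), (1,-1), (1,1), (-1,1).
    """
    # Derivatives of the four bilinear shape functions in local coordinates.
    dn_dxi = [-(1 - eta) / 4, (1 - eta) / 4, (1 + eta) / 4, -(1 + eta) / 4]
    dn_deta = [-(1 - xi) / 4, -(1 + xi) / 4, (1 + xi) / 4, (1 - xi) / 4]
    dx_dxi = sum(d * x for d, (x, _) in zip(dn_dxi, nodes))
    dy_dxi = sum(d * y for d, (_, y) in zip(dn_dxi, nodes))
    dx_deta = sum(d * x for d, (x, _) in zip(dn_deta, nodes))
    dy_deta = sum(d * y for d, (_, y) in zip(dn_deta, nodes))
    return dx_dxi * dy_deta - dx_deta * dy_dxi


if __name__ == "__main__":
    rectangle = [(0.0, 0.0), (2.0, 0.0), (2.0, 4.0), (0.0, 4.0)]
    print(jacobian_determinant(rectangle, 0.0, 0.0))
```

For a rectangle the determinant is constant and equal to one quarter of the area, because the parent square has area four. For a mapped infinite element the same formula applies, but with the derivatives of the singular mapping functions in place of the bilinear ones.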

THE  VIRTUAL  FINITE  ELEMENTS 


Good solution accuracy is obtained with the mapped infinite element technique coupled with the finite
element method explained above. However, the external field is not obtained with this technique. To solve this problem,
a mesh of C^0 Lagrange elements [13] (called virtual elements) is created in the external region, as shown in figure 4 [8].

Figure 4. Virtual elements applied at the finite element mesh coupled with the mapped infinite element technique.

In the virtual mesh, a Dirichlet boundary condition is imposed, and the values of the state variable in the finite
element mesh calculated with mapped infinite elements are maintained.

APPLICATION 

The method has been successfully applied to several 2D electromagnetic field problems. Two examples are
presented. The studied domains are presented in figure 5(a) and figure 7(a).

The meshes of the domains are presented in figures 5(b) and 7(b). Mapped infinite elements are used on the whole
boundary of figure 5 and on the arc of circumference of figure 7.

In figure 6 and figure 8, the equipotentials are presented. Figures 6(a) and 8(a) present the equipotential field
with infinite elements only. A new region is created and discretized in virtual elements and a Dirichlet boundary
condition equal to zero is imposed (figure 4). In figures 6(b) and 8(b), infinite elements with virtual elements are used at
the infinite boundary to show the whole field.

Figure 7. Field winding with an iron core. (a) Domain of study; (b) Finite element mesh


Figure 8. (a) Infinite elements only; (b) Infinite elements with virtual elements


CONCLUSION 

The method described, with the mapped infinite element used as a boundary condition, does not increase the final
matrix dimension. The number of elements and nodes of the domain mesh is not altered. The results obtained show that
the Finite Element Method coupled with the Mapped Infinite Elements technique gives better results than FEM alone. In
electromagnetic problems where the field tends to zero at infinity, the use of a technique like the mapped infinite element
becomes very important. Virtual elements are a good solution for obtaining the external field and exploring the entire domain.
The results show that the mapped infinite element technique can solve unbounded problems with good accuracy.

Abstract

The electromagnetic modeling of large cavities has been an active research topic for several years. Recently, efforts at
McDonnell Douglas Aerospace (MDA) have focused on upgrades to the CAVERN (Cavity Electromagnetic Analysis) code.
Our approach separates the scattering from jet engine cavities into two principal physical mechanisms. One involves the
initialization and propagation of energy down the duct. The other is the scattering from complicated engine face configurations.

An important benefit of this approach is that the exterior moldline or the duct configuration can change, but the engine
analysis need not be recomputed. Demonstrably, this is more efficient than using the same technique for the external airframe, for
propagation down the duct, and for the engine face. For complex large inlets CAVERN uses the shooting and bouncing ray
(SBR) technique to trace rays to an area near the engine face where the geometrical cross section typically becomes uniform
and cylindrical in nature. A termination aperture is defined, and the rays across this plane are decomposed into equivalent
modes. A Fast Fourier Bessel Transform (FFBT) is incorporated in CAVERN for optimal efficiency for the ray-modal
conversion.

A modal reflection coefficient matrix is used to characterize the engine face region and provide outgoing modal functions
at the termination aperture. Application of the reciprocity theorem allows computation of the radar cross section using an
integral evaluated across this plane without requiring tracking the rays to the physical aperture. For a given frequency, this modal
methodology is two times faster than using the standard SBR approach for propagation both into and out of an inlet. The solution
with FFBT modal decomposition for conversion of rays into modes is several orders of magnitude faster than direct integration.

More accurate analyses capturing higher order mechanisms can be used when separately characterizing the termination
region of the duct at the engine face. The methods of choice in the termination scattering region are exact low frequency methods
such as finite element. However, these methods are limited to electrical lengths much smaller than typically required for full
sized air vehicles at X-band in terms of storage and computational times. Presently, physical optics/physical theory of
diffraction (PO/PTD) is used for the termination matrix, along with plane wave decomposition of the ray optic field in the termination
area. This is a cooperative effort with Ohio State University. Conceptually, calculation of a plane wave reflection coefficient
matrix is much simpler than the creation of the modal matrix. A physical optics code such as CADDSCAT is used to generate
a plane wave matrix for modeling the engine face. This matrix is then converted to a modal matrix and stored for subsequent
use in CAVERN when scattering from a specified engine is required. The termination matrix methodology streamlines
calculations for large jet engine inlets. This technique has been utilized on a variety of geometries. Future work will examine the range
of validity of the PO/PTD solution for the generation of the termination matrix.

I.  Introduction 

Engine inlets in aircraft are a primary contributor to the radio-frequency signature for certain aspect ranges, and it is
imperative to provide not only the most accurate but also the most efficient methods for computing the radar cross section (RCS). It
is attractive to separate the exterior and interior scattering for generation of RCS. With this approach, the engine face problem
can be analyzed and results accessed separately. Furthermore, more accurate analyses capturing higher order scattering
mechanisms are used when separately characterizing the engine region. Figure 1 summarizes the separation of the jet engine scattering
problem into three areas of major focus. The first involves physical aperture coupling to external features. This includes
multiple bounce and shadowing effects of the external features with respect to the aperture. This could be accomplished with
Generalized Ray Expansion (GRE) or SBR. SBR was used for the hybrid termination analysis included here. The second area is the
termination aperture, involving modal decomposition or plane wave expansions. The third area is the termination scattering
region. The methods of choice here involve exact low frequency methods such as finite element. These methods are not quite
ready for complex terminations of 40 wavelength diameter in terms of storage or computational times, and physical optics/
physical theory of diffraction (PO/PTD) can also be used for the termination matrix.

Figure 1. Areas of Major Focus

Figure 2. Geometry of the Inlet Scattering Problem


CADDSCAT analysis is used to generate a plane wave termination matrix using PO/PTD for modeling engine face
complexity (Section II). This matrix is then converted to a modal termination matrix (Section III), and stored for use when
scattering from a specified engine must be added to external aircraft effects. Calculations using the new technique are compared with
available data for cylindrical cavities terminated with flat plates, hemispherical hubs and curved blade geometries (Section IV).

II.  Modal  Decomposition  and  Plane  Wave  Termination  Reflection  Matrix 

CAVERN uses standard ray analysis combined with a modal analysis near the termination region to calculate RCS. Rays
are initiated in the aperture plane and traced to an area near the back of the cavity where the geometrical cross section becomes
uniformly cylindrical. The ray representation of the electric field is expanded in terms of the known modal functions. An
application of the reciprocity theorem of electromagnetics allows computation of RCS using a reaction integral (Ref. 1).

Tracking rays both into and out of the cavity is subject to errors associated with the geometrical optics approximation. For
realistic sized ducts with length to diameter ratios of >3:1, the validity of standard ray approaches may break down completely.
CAVERN's solution to this problem has been the use of both ray and modal analyses with a reflection coefficient matrix
representing the engine face. The reciprocity algorithm is used for both efficiency and accuracy so that rays do not have to be traced
back to the entrance of the cavity. For a variety of ducts, MDA's CAVERN code provides more fidelity than is possible with
other methods.

The unique features of CAVERN, combining the ray and modal solutions and the calculation of RCS in conjunction with
the reaction integral based on the reciprocity theorem, were discussed in Ref. 2. The termination reflection coefficient matrix
can be found with finite element methods under development (Ref. 3). In this section, we focus on the modal decomposition,
as well as plane wave decomposition and PO/PTD termination matrices, which are being pursued cooperatively with OSU (Ref.
4) and are discussed in the next sections.

Figure 2 shows the geometry definitions necessary for an application of the reciprocity theorem. As discussed in Ref. 2,
we note that the equations governing the conversion of rays to modes can be compactly formulated. The radial integral can
be written as linear combinations of

F(\rho) = 2\pi \int_0^{a} r \, f(r, \theta) \, J_n(2\pi\rho r) \, dr    (1)

where \rho = p_q / (2\pi a)

with p_q equal to the roots of the Bessel function or its derivative, and where J_n(x) is the Bessel function.

Applying the FFBT to this integral, we use the technique discussed in Ref. 5. The non-uniform ray information must be
transformed to an exponential grid in order to use the procedures. The FFBT method translates the problem into three simple
FFTs, resulting in more than an order of magnitude improvement in runtime. Performing the theta integral first with FFTs
provides all the modal information for n at once and is a more efficient way to evaluate the integral of Equation 1. The generation
of the grid is not very expensive and the problem is simply reduced to the calculation of many FFTs.

In the plane wave decomposition technique, rays are transformed into discrete plane waves. There are both incoming and
outgoing plane waves. These outgoing plane waves are related to the incoming plane waves through a plane wave termination
reflection matrix. Figure 3 illustrates the plane wave expansion. The geometry of the engine area from the termination aperture
to the back of the blade area is modeled with incident plane waves. The plane wave matrix is generated bistatically. Each
incident angle has associated with it an entire spectrum of observation angles. Thus, a plane wave matrix is generated. For large
bodies, it is expected that a plane wave matrix generated with PO/PTD will capture essential scattering mechanisms for the
termination area. The plane wave termination matrix is easily generated with CADDSCAT, using the equation shown in Figure
4. Here j represents the incident plane wave, i is the reflected plane wave, and a is the radius of the circular area of the termination
aperture.

Figure 3. Plane Wave Expansion

Figure 4. Generation of Plane Wave Matrix (Equation 51 of Ref. 7)


To accurately analyze a complex engine face, exact methods are attractive. The authors of Ref. 4 used a modified MDA
code (CICERO) utilizing method of moments (MM) for bodies of revolution to calculate a plane wave scattering matrix. This
scattering matrix for the termination region is then used with either a modal, SBR, or GRE solution in the front section. Detailed
results on these methods are included in Ref. 4. Exact methods will work well for rotationally symmetric terminations of
diameters less than ~10λ. However, these methods presently remain a computational challenge, and are being investigated at the
University of Michigan (Ref. 3) and OSU. In this paper, we are using the method outlined in Refs. 6, 7 for calculating the plane
wave termination matrix for an arbitrary engine face with PO/PTD.


An example for a flat plate termination is described below. For the case of a flat disk with the termination aperture coincident
with the back plate, we expect the area (A) to be πa². Using the PO formula for a flat plate at normal incidence, we have:

\sigma = \frac{4\pi (\pi a^2)^2}{\lambda^2}    (2)

We know that

\sigma = \lim_{r \to \infty} 4\pi r^2 \frac{|E_{scat}|^2}{|E_{inc}|^2}    (3)

Combining these two equations, we find:

(relative scattered field) = \lim_{r \to \infty} r \, (E_{scat} / E_{inc}) = \frac{\pi a^2}{\lambda}    (4)

Using the equation in Figure 4, we obtain:

|\Gamma_{PW}| = \frac{\lambda}{(2a)^2} \cdot \frac{\pi a^2}{\lambda} = \frac{\pi}{4}    (5)
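The physical optics result for a flat plate at normal incidence, σ = 4πA²/λ² with A = πa² for a disk, is simple to evaluate numerically. The sketch below is only a worked illustration of that formula and of the relative scattered field πa²/λ quoted in the text; the function names are hypothetical and the code is not part of CADDSCAT or CAVERN.

```python
import math


def flat_plate_rcs_normal(area, wavelength):
    """PO radar cross section of a flat plate at normal incidence."""
    return 4.0 * math.pi * area ** 2 / wavelength ** 2


def disk_rcs_normal(radius, wavelength):
    """Same formula specialised to a circular disk of the given radius."""
    return flat_plate_rcs_normal(math.pi * radius ** 2, wavelength)


def relative_scattered_field(radius, wavelength):
    """Limit of r * |E_scat / E_inc|, obtained from sigma = 4*pi*r^2*|E_s/E_i|^2."""
    return math.sqrt(disk_rcs_normal(radius, wavelength) / (4.0 * math.pi))


if __name__ == "__main__":
    # Two-wavelength diameter disk, lengths expressed in wavelengths.
    print(disk_rcs_normal(1.0, 1.0))
    print(relative_scattered_field(1.0, 1.0))
```

For a radius of one wavelength the relative scattered field evaluates to π wavelengths, which is the πa²/λ value used above.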

The maximum number of plane waves is determined by the inequality

m^2 + n^2 < (ka \sin\theta_{max} / \pi)^2    (6)

where θ_max is the maximum plane wave propagation angle, usually 80°, not to exceed 90° (Ref. 6). In the case of a two λ
diameter disk, shown in Figure 5, we obtain nine plane waves (for each polarization) to satisfy the above relationship. The "-π/4"
elements are the theoretically derived values, based on a scattered field of πa²/λ as discussed above (Equation 4). In Figure
6, we show a section of the matrix calculated by CADDSCAT corresponding to the idealized case. The plane wave angles were
generated according to Ref. 7.

Table 1 lists the size of the matrices expected for various engine face radii. The matrices are large, increasing approximately
as a^4. The numbers in the table do not reflect a factor of two to account for both vertical and horizontal polarizations. Typically,
fighter inlets have a radius of approximately 20 wavelengths at X-band.
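The plane wave count can be reproduced for small radii by counting integer pairs (m, n) that satisfy m² + n² < (2a sin θ_max / λ)², which is the same bound as (ka sin θ_max / π)². The sketch below assumes a strict inequality and θ_max = 80°; the function name is hypothetical.

```python
import math


def count_plane_waves(radius_wavelengths, theta_max_deg=80.0):
    """Count integer pairs (m, n) with m^2 + n^2 < (2*a*sin(theta_max)/lambda)^2.

    `radius_wavelengths` is the aperture radius a expressed in wavelengths.
    The result is per polarization.
    """
    bound = (2.0 * radius_wavelengths * math.sin(math.radians(theta_max_deg))) ** 2
    limit = int(math.floor(math.sqrt(bound)))
    return sum(
        1
        for m in range(-limit, limit + 1)
        for n in range(-limit, limit + 1)
        if m * m + n * n < bound
    )


if __name__ == "__main__":
    for a in (1.0, 2.5):
        npw = count_plane_waves(a)
        print(f"a = {a} wavelengths: NPW = {npw}, matrix size = {npw * npw}")
```

Under these assumptions the count is 9 for a = 1.0λ and 69 for a = 2.5λ, matching the first two table entries. For larger radii the count becomes sensitive to the exact value of θ_max and to whether points on the bounding circle are included, so this sketch should not be expected to reproduce every table entry.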

Figure 5. Plane Wave Matrix Example for 2λ Diameter Disk, Flat Plate Termination. The allowed indices satisfy m² + n² < (2a sin θ_max/λ)² = 3.84, giving n = -1, 0, +1, each with m = -1, 0, +1.

Figure 6. Plane Wave Format for 2λ Diameter Disk. Columns: incident (m, n), reflected (m, n), index, incident and reflected azimuth and elevation angles (deg), and the plane wave reflection coefficient.

TABLE 1. PLANE WAVE TERMINATION MATRIX SIZE

a (λ)     Number of Plane Waves (NPW)
1.0       9
2.5       69
5.0       301
8.3       853

(size of PW matrix) = (NPW) × (NPW)

III. Conversions Between Modes and Plane Waves

In this section, the conversion of plane wave matrices generated by CADDSCAT to modal matrices is discussed. The modal
decomposition used in CAVERN subroutines is then interfaced to these matrices. Cooperative work is also ongoing to interface
University of Michigan modal matrices directly with CAVERN.

The conversion of plane wave matrices to modal matrices is done using a modification of Ref. 7. This conversion is shown
in Figure 7 using the transfer matrices of Ref. 7. Once the plane wave termination matrix is found, the modal matrix
is calculated by matrix multiplication of the transfer matrices. Matrix [U] translates coefficients of incident modal functions
to coefficients of incident plane wave functions. The "+" superscript refers to plane waves traveling in the +y direction, with incidence
toward the complex termination, and the "-" superscript refers to travel reflected from the complex termination.
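The matrix bookkeeping behind this conversion can be sketched as a chain of three products. In the sketch below `u` (Q × P) maps incident modal coefficients to incident plane wave coefficients, `gamma_pw` (Q × Q) is the plane wave termination matrix, and `v` (P × Q) maps reflected plane wave coefficients back to modal coefficients. The name `v`, the function names and the toy numbers are assumptions for illustration; the actual transfer matrices are those of Ref. 7 and are not reproduced here.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


def plane_wave_to_modal(v, gamma_pw, u):
    """Modal matrix (P x P) = v (P x Q) . gamma_pw (Q x Q) . u (Q x P)."""
    return matmul(matmul(v, gamma_pw), u)


if __name__ == "__main__":
    # Toy case with P = 1 mode and Q = 2 plane waves (illustrative numbers).
    u = [[1.0], [1.0]]
    gamma_pw = [[-1.0, 0.0], [0.0, -1.0]]
    v = [[0.5, 0.5]]
    print(plane_wave_to_modal(v, gamma_pw, u))
```

With P modes and Q plane waves the result is P × P, so the large plane wave matrix is only needed while the conversion is carried out.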

Figure 8 shows the necessary process. Conversion between plane wave and modal termination matrices is directly
accomplished with the PW2MODI program. With the MDA code structure, generation of the plane wave matrix could be
accomplished with bistatic finite element results or test data.

For  a  simple  fiat  plate  disk  wiUi  a  two  I  diameter,  die  plane  wave  mau-ix  generated  with  CADDSCAT  has  dominant  terms 
of  -Tr/4  and  die  oUicr  tenns  are  small  tus  was  shown  in  Figure  6.  When  the  plane  wave  matrix  is  converted  to  modes  widi  the 
PW2MODI  code,  die  diagonal  tenns  for  die  converted  modal  matrix  tire  close  to  -1  for  die  most  part,  but  diey  may  not  agree 
widi  more  exact  methods  for  liiis  sniidl-sizcd  geometry.  I-'or  die  two  1  diameter  case,  diese  matrices  are  m  close  agreement 
w'idi  results  from  the  OSU  subroutines  discussed  in  Ref.  7. 
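The matrix-multiplication step described above can be sketched with a toy example. This is only an illustration under stated assumptions: the function name pw_to_modal, the random transfer matrix U, and the choice of the reverse transfer matrix V as the pseudo-inverse of U are hypothetical and are not taken from PW2MODI or Ref. 7. The plane wave matrix is also idealized as a perfect short (-1 on the diagonal), which ignores the normalization that produces the -π/4 terms in the CADDSCAT matrix.

```python
import numpy as np

def pw_to_modal(pw_matrix, modes_to_pw, pw_to_modes):
    """Modal termination matrix (P x P) from a plane wave matrix (Q x Q).

    modes_to_pw : (Q x P) maps incident modal coefficients to plane waves
    pw_to_modes : (P x Q) maps reflected plane waves back to modes (assumed)
    """
    return pw_to_modes @ pw_matrix @ modes_to_pw

rng = np.random.default_rng(0)
P, Q = 5, 9   # P modes, Q plane waves (Q = 9 matches the a = 1.0 row of Table 1)

# Hypothetical transfer matrix: modes -> plane waves
U = rng.standard_normal((Q, P)) + 1j * rng.standard_normal((Q, P))
# Assumption: the reverse transfer is the pseudo-inverse of U
V = np.linalg.pinv(U)

# Idealized flat-plate short expressed in the plane wave basis
S_pw = -np.eye(Q)

S_modal = pw_to_modal(S_pw, U, V)
print(np.round(np.diag(S_modal).real, 6))
```

In this idealized setting the converted modal matrix has -1 on its diagonal, which is the behavior the text reports for the flat plate case.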
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•  Two  Routines  Combined 

-  Post-Processor  for  CADDSCAT  and  PW2MODI 
Combined  Into  One  Code 

• Files Written to Disk Now Stored in Binary

- Reduces Storage by a Factor of Three

Figure 8. Engine Face Analysis Process

Figure 7. Conversion of Plane Wave to Modal Matrix

P = Maximum Number of Modes, Q = Maximum Number of Plane Waves

IV. CAVERN Results with Plane Wave Matrices and Modal Conversion

An initial validation case for CAVERN using the new method was for a cylinder with a flat plate termination having the same diameter as the F-l? engine. This 1m by 3m cylinder was analyzed at 3.0 GHz (equivalent to a 10 wavelength diameter). CADDSCAT PO was used to generate the plane wave matrix for the termination aperture plane, which coincided with the cavity back plate. PW2MODI was used to convert the plane wave matrix to a modal matrix, needed as input to CAVERN. We obtained very close agreement between CAVERN's CW TERM results (using -1's along the diagonal) and the new plane wave expansion/modal conversion method.

Realistic geometries do not allow us to have the termination plane at the back plate. To validate the methods being used, we performed a comparison of our results with those of OSU. This comparison is shown in Figure 9 for a termination aperture placed 10" in front of the back plate. In the OSU results, SBR is used in the first section of the duct, the plane wave expansion is used to couple the duct to the termination, and the plane wave termination matrix was calculated using an SBR approach. The termination section is 10" in length as shown. In the CAVERN plot, SBR is used in the duct section and a plane wave termination was calculated using an SBR approach. PW2MODI was used to calculate a modal termination matrix which was then read into CAVERN. This plot provides a comparison of the RCS and allows us to evaluate the usage of the modal and plane wave decomposition techniques. The results are relatively insensitive to the placement of the termination aperture.


Figure  9.  Comparison  Between  CAVERN  and  OSU  Results  for  10  Inch  Section  of  Termination 

Frequency = 3 GHz, Elevation = 0°

CAVERN results using the SBR/modes/SBR method are shown in Figure 10 for a duct with a hemispherical hub termination [...] labeled "modcic" [...]. Comparison results from OSU are shown [...]. In [...] is used in the 2m duct section and the MDA MM code CICERO for the 0.5m termination region (Ref. 4). In the OSU result ("sbrcic") the 2m duct section is analyzed with SBR. Clearly, the results differ based on whether approximate or exact solutions are used in the front duct section. The SBR/plane wave/MM result (labeled "sbrcic") is in good agreement with the CAVERN result and is consistent with the level of agreement for [...] waves/MM and SBR/plane waves/SBR for flat plate 1m diameter by 2.5m length cylinders at 2.4 GHz (Ref. 4).

Results for the twisted blade configurations, similar to that shown in Figure 11, are being generated with University [...] limited mode matching, transmission line, and finite element methods for a variety of frequencies. The modal matrix [...] the blade sections with the hub and also the cylindrical duct, illustrated in Figure [...] twisted blades that are each 4° wide. The blades are terminated directly in a short. The CAVERN RCS for this is shown in Figure 11. It will be compared to finite element results when available from the University of Michigan and OSU.


Figure  11.  RCS  Results 

Frequency = 3 GHz, Elevation = 0°, 5λ Radius

V. Summary Discussion

The major thrusts of this effort are summarized below:

1. Fast Fourier Bessel Transform (FFBT) techniques for the modal decomposition algorithm were incorporated into CAVERN,

2. generation of the plane wave termination matrix using the PO/PTD approach of CADDSCAT has been accomplished,

3. codes to convert the plane wave matrix to a modal termination matrix were written, and

4. use of SBR for duct propagation, coupled with modal decomposition and plane wave termination matrix generation, has been compared to available data for both flat plate and complex hub/blade terminations.
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1.  Abstract 

Shooting and Bouncing Rays (SBR) is a high-frequency technique for attacking complex 3-D scattering problems. It has been implemented in a general-purpose radar signature computer code, 'Xpatch', and the code is widely used in the RCS community [1-2]. SBR so far has been applied to scatterers made of conductor or conductor coated with thin material. In this paper, we extend SBR to bulk material, and present several test cases involving cavities loaded with bulk dielectric. Our SBR results generally are in good agreement with the exact solution calculated by the method of moments (MoM).


2.  Introduction 

A difficult problem in performing electromagnetic scattering computations is predicting the scattering from electrically large, realistic objects with bulk materials. Low-frequency methods, such as MoM, Finite Element Method, and Finite Difference Time Domain, are well suited for handling problems involving materials, but are often limited by electrically large geometries. This is due to the large number of unknowns needed to accurately represent the problem. High-frequency methods, such as SBR, Geometrical Theory of Diffraction, and Physical Theory of Diffraction, are good for predicting the scattering from these geometries, but generally do not deal with bulk materials as well. A method for doing high-frequency calculations for a general geometry containing bulk materials and using SBR is presented in this paper.


3.  Formulation 

The method used by Xpatch to compute the scattered far-fields from a target consists of three steps: (1) calculating the scattered far-fields for the first bounce response over the lit surfaces using physical optics (PO); (2) shooting a grid of parallel rays representing a plane wave towards the target, tracing each ray through the target, then performing a PO type integration at the ray's last hitpoint; and (3) adding the first-order edge-diffraction from all conducting edges. Our proposed method for using bulk materials expands on step (2) above.


At each ray hitpoint on a bulk material, Snell's Law is used to compute both the reflected and transmitted ray directions, and a new ray is spawned in the transmitted direction. The complex fields for each ray are computed using the incident fields and the Fresnel reflection and transmission coefficients for the given interface. These rays are then traced until the ray either exits the target, or reaches a predetermined number of bounces. For those rays that do exit the target, a PO type integration is performed at that ray's last hitpoint to determine the backscattered far-fields.
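The per-hitpoint computation described above can be sketched for the simplest case of a ray entering a lossless material from free space. This is a minimal illustration, not the Xpatch implementation: the function name snell_fresnel and its argument names are hypothetical, the coefficients follow the standard textbook impedance form, and the lossy case (where the text notes additional difficulties) is not treated.

```python
import cmath
import math

def snell_fresnel(theta_i, eps_r, mu_r=1.0):
    """Refraction angle and Fresnel coefficients for a ray entering a
    material half-space from free space (lossless sketch).

    theta_i     : incidence angle measured from the surface normal (radians)
    eps_r, mu_r : relative permittivity and permeability of the material
    Returns (theta_t, r_perp, t_perp, r_par, t_par).
    """
    n = cmath.sqrt(eps_r * mu_r)      # refractive index of the material
    eta = cmath.sqrt(mu_r / eps_r)    # wave impedance relative to free space
    sin_t = math.sin(theta_i) / n     # Snell's law with n = 1 on the incident side
    cos_t = cmath.sqrt(1.0 - sin_t ** 2)
    cos_i = math.cos(theta_i)
    # Perpendicular polarization: E normal to the plane of incidence
    r_perp = (eta * cos_i - cos_t) / (eta * cos_i + cos_t)
    t_perp = 2.0 * eta * cos_i / (eta * cos_i + cos_t)
    # Parallel polarization: E in the plane of incidence
    r_par = (eta * cos_t - cos_i) / (eta * cos_t + cos_i)
    t_par = 2.0 * eta * cos_i / (eta * cos_t + cos_i)
    theta_t = cmath.asin(sin_t)
    return theta_t, r_perp, t_perp, r_par, t_par

# Example: 30 degree incidence on a lossless dielectric with relative permittivity 4.0
theta_t, r_perp, t_perp, r_par, t_par = snell_fresnel(math.radians(30.0), 4.0)
```

In a ray tracer, the reflected ray would carry the incident field scaled by the reflection coefficient and the spawned transmitted ray would carry the field scaled by the transmission coefficient, each resolved into the two polarizations.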

This  proposed  method  is  versatile,  because  it  can  in  theory  handle  geometries  containing  bulk 
materials,  conducting  surfaces,  and  thin  material  surfaces  at  the  same  time.  Preliminary  results 
indicate  that  the  method  works  better  for  lossless  cases  than  for  lossy  cases,  due  to  the  fact  that  the 
direction  of  constant  phase  in  a  lossy  material  is  different  from  the  direction  of  the  energy  flow. 


4.  Results 

We have implemented the SBR bulk material method in Xpatch. We now present a comparison of results using Xpatch and a 2-D MoM code [3] for some simple geometries. Since Xpatch is a 3-D code, the conversion

$$\mathrm{RCS}_{3D} = \mathrm{REW}_{2D} + 10 \log_{10}\!\left(\frac{2 l^2}{\lambda}\right)$$

is used to convert from 2-D Radar Echo Width results to 3-D Radar Cross Section results. In the above, $l$ represents the length of the object in the z-direction and $\lambda$ is the wavelength in meters. All geometries used with Xpatch can then be created uniform in z, and the problem becomes 2-dimensional, with the backscattering computed only over an azimuthal sweep. In all cases, the E-field vector points out of the page.
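As a small worked example of the 2-D to 3-D conversion above, the following helper adds the length correction term in decibels. The function name rew_to_rcs_db and its argument names are hypothetical, and the sample numbers are illustrative only, not values from the test cases in this paper.

```python
import math

def rew_to_rcs_db(rew_db, length_m, wavelength_m):
    """Convert a 2-D radar echo width (dB) to a 3-D RCS (dBsm) for an
    object of finite length in z: RCS = REW + 10 log10(2 l^2 / lambda)."""
    return rew_db + 10.0 * math.log10(2.0 * length_m ** 2 / wavelength_m)

# Illustrative numbers: 1 m long object at a 0.1 m wavelength
correction = rew_to_rcs_db(0.0, 1.0, 0.1)   # 10 log10(20), about 13.01 dB
```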

4.1  Bulk  Cube 

The first geometry tested is a simple square cube of bulk material, using varying dielectric constants. The dimensions are 6.67λ on a side, and the backscattered fields are computed from 0 to 90 degrees due to symmetry. For this case, the bulk material used has a relative permittivity of 2.0 - j1.0 and a relative permeability of 1.0 + j0.0. The plot comparing the MoM results with the SBR results is shown in Figure 1.

4.2  Bulk  Cube  with  Conducting  Sides 

The next set of geometries deals with the backscattering from lossless bulk materials with conducting walls. This represents a more general problem, incorporating both bulk materials and conductors. The first case is that of a lossless material with a dielectric constant of 9.0 + j0.0 and one conducting side, and the results comparing the MoM solution with that of the SBR are shown in Figure 2.

The next case is similar to the above, except that now we use conductor on two sides of the material, and a different dielectric. For this case, an azimuthal sweep from 0 to 360 degrees is performed, using a material with a relative permittivity of 4.0 + j0.0. The results are shown in Figure 3.

A case with three conducting sides is next, using a bulk dielectric with a relative permittivity of 2.0 + j0.0. The results of an azimuthal sweep from 0 to 360 degrees are shown in Figure 4.

In all three cases, good agreement is shown between the MoM and the SBR results for most points. This is encouraging, showing that the method is able to handle different lossless dielectrics and conducting surfaces simultaneously.
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4.3  Bulk  Material  Inside  a  Cavity 


The next step is to verify that the proposed method can work for a more complicated geometry, that of a cavity with a smaller bulk cube inside it. For this case, we use a 10λ x 10λ cavity loaded with a diagonally placed 2.36λ dielectric cube, centered within the cavity. The relative permittivity of the material is 2.0 + j0.0. This represents a significant difference in the geometry, due to the fact that there are now air gaps between the conductor and the bulk material, and direct interactions between rays entering, leaving, and re-entering the bulk material are present. Again we find good comparison between the MoM results and the Xpatch predicted results, as shown in Figure 5.

4.4  Bulk  Loaded  Jet  Engine 

For this case, range profiles are computed for two different aircraft inlet structures. The first is a jet engine inlet made of only conducting surfaces. The second is the same jet engine inlet, except that a large dielectric block has been placed inside the inlet near the engine blades. A side-view of the proposed geometry is shown in Figures 6 and 7. It should be noted that this is done for illustrative purposes only; this would, of course, not be done in practice. The two range profiles are then compared in Figure 8, where the effects of the bulk dielectric material can be seen.


5.  Conclusions 

In this paper, we have demonstrated that SBR works well for some simple geometries involving bulk material. It remains to verify that similar accuracy can be obtained for more general 2-D and 3-D cases.

Figure 1. Scattering from a bulk material cube with εr = (2.0, -1.0), μr = (1.0, 0.0). The E-field is directed out of the page.

Figure 2. Scattering from a bulk cube with εr = (9.0, 0.0), μr = (1.0, 0.0) and conductor on one side. The E-field is directed out of the page.

Figure 5. Scattering from a 2.36λ bulk cube inside a 10λ x 10λ conducting cavity. The bulk material has εr = (2.0, 0.0), μr = (1.0, 0.0). The E-field is directed out of the page.

Figure 6: View of the complete jet inlet from Azimuth = -90°, Elevation = 0°. Both objects are to the same scale, and show the relative position of the blades and hubs inside the inlet.


[Two panels: Range Profile (dBsm) versus Down Range (inches)]

Figure  8.  Range  profiles  for  the  two  inlets,  using  only  those  rays  entering  the  inlet 
mouth.  The  top  figure  shows  a  range  profile  for  the  empty  inlet,  and  the  bottom 
figure  shows  the  range  profile  for  the  loaded  inlet.  The  differences  between  the  two 
can  be  seen  in  both  magnitude  and  in  the  late-time  response. 
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XPATCH  SIMULATION  OF  LARGE  INLET  STRUCTURES 
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1.  INTRODUCTION 

Xpatch is a general-purpose high-frequency radar signature prediction code based on the shooting and bouncing ray (SBR) technique [1]-[3]. In this work, we present the signature simulation results for large inlet structures using Xpatch. First, the accuracy of Xpatch is demonstrated through comparisons with the modal approach for canonical duct configurations. It is shown that the accuracy of Xpatch for typical engine intake structures (with 1 m opening and 8 m depth) remains valid from X-band down into the S-band regime. In addition, a fast scheme for generating the 2-D ISAR (inverse synthetic aperture radar) imageries of targets is described. This is achieved by deriving a closed form image-domain ray spread function and carrying out the image update using the fast ray summation scheme of Sullivan [4]. The utilization of the fast scheme reduces the signature prediction time to only the geometrical ray tracing time. Finally, we present the simulation methodology and prediction results for inlet structures containing rotating compressor blades based on Xpatch simulation. The jet engine modulation (JEM) phenomenon can be clearly identified in both the Doppler spectrum and ISAR imagery.


2. XPATCH1 AND XPATCH3

In the Xpatch package, there are two separate codes for radar signature computation, namely, Xpatch1 and Xpatch3. Both codes contain the same ray tracer. However, the field calculations are carried out in the frequency domain in Xpatch1 and in the time domain in Xpatch3. Below we describe an approximate but extremely fast method of generating the ISAR imageries of targets in Xpatch3 that we have implemented recently. Contrary to the Xpatch1 approach, where the ISAR image is obtained by inverse Fourier transforming the predicted scattered field data over frequency and aspect, Xpatch3 uses an image domain SBR formulation. We shall represent the Xpatch1 ISAR image by the expression:

$$\mathrm{Image}(x,z) = \mathrm{F.T.}^{-1}\Big\{ \sum_{i\ \mathrm{rays}} E_i(\omega,\theta) \Big\} \tag{1}$$

where the quantity in the braces is the total scattered field at frequency $\omega$ and aspect $\theta$ and is obtained through the summation over all exit rays. The x and z variables represent respectively the cross range and down range coordinates. By interchanging the order of the inverse Fourier transform and the ray summation, this term can be determined in closed form under the small angle approximation [5], and the resulting image is explicitly expressed as:

$$\mathrm{Image}(x,z) = \sum_{i\ \mathrm{rays}} \alpha_i\, h(x - x_i,\ z - z_i) \tag{2}$$
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where 

$$h(x,z) = \mathrm{sinc}(\Delta k\, z)\ \mathrm{sinc}(k_0\, \Delta\theta\, x)$$

is the "ray spread function," and $\Delta k$ and $\Delta\theta$ are the bandwidth and angular width associated with the image.

To speed up the computation of the ray sum, a scheme proposed by T. D. Sullivan [4] can be utilized. The application of the fast scheme is based on the observation that we can recast (2) into a convolution between a weighted impulse train and the ray spread function:

$$\mathrm{Image}(x,z) = \Big[ \sum_{i\ \mathrm{rays}} \alpha_i\, \delta(x - x_i)\, \delta(z - z_i) \Big] * h(x,z) \tag{3}$$

Rather than performing the direct convolution, we take advantage of the FFT algorithm. The problem associated with taking the FFT of the weighted impulses which do not occur on a uniformly sampled grid is overcome by using an interpolation scheme [6]. The table below compares the breakdown of the computation time for ISAR image formation for a full-size aircraft at X-band using direct convolution and the Sullivan scheme. The timing is done on a Silicon Graphics Indigo R4000 workstation. The convolution takes 1 min. with the Sullivan scheme as compared to 3.2 hrs. by direct convolution, a speed gain of a factor of 180. It is evident that the total image simulation time of 40 min. is just the ray-trace time. The time spent on the Sullivan scheme is essentially zero.


                     Total Computation   Ray Tracing Time   Ray Summation
Direct Convolution   3.9 hrs.            40 min.            3.2 hrs.
Sullivan Scheme      41 min.             40 min.            1 min.
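The difference between the two ray summation approaches timed above can be sketched as follows. This is an illustration only: the function names are hypothetical, sinc(u) is taken as sin(u)/u, and the bilinear deposit of each ray onto the uniform grid is an assumed stand-in for the interpolation scheme of [6], whose details are not given here. Rays are assumed to lie inside the image grid.

```python
import numpy as np

def ray_spread(x, z, dk, k0_dtheta):
    # h(x, z) = sinc(dk z) sinc(k0 dtheta x), with sinc(u) = sin(u)/u
    return np.sinc(dk * z / np.pi) * np.sinc(k0_dtheta * x / np.pi)

def image_direct(xg, zg, xr, zr, amp, dk, k0_dtheta):
    """Direct ray sum: one ray spread function per ray, cost ~ rays x pixels."""
    X, Z = np.meshgrid(xg, zg, indexing="ij")
    img = np.zeros(X.shape, dtype=complex)
    for xi, zi, ai in zip(xr, zr, amp):
        img += ai * ray_spread(X - xi, Z - zi, dk, k0_dtheta)
    return img

def image_fft(xg, zg, xr, zr, amp, dk, k0_dtheta):
    """Deposit the rays on the uniform grid, then convolve once with h via FFT."""
    nx, nz = len(xg), len(zg)
    dx, dz = xg[1] - xg[0], zg[1] - zg[0]
    grid = np.zeros((nx, nz), dtype=complex)
    # Bilinear deposit of each weighted impulse onto its four neighbours (assumed)
    fx = (np.asarray(xr) - xg[0]) / dx
    fz = (np.asarray(zr) - zg[0]) / dz
    ix = np.clip(np.floor(fx).astype(int), 0, nx - 2)
    iz = np.clip(np.floor(fz).astype(int), 0, nz - 2)
    wx, wz = fx - ix, fz - iz
    a = np.asarray(amp, dtype=complex)
    np.add.at(grid, (ix, iz), a * (1 - wx) * (1 - wz))
    np.add.at(grid, (ix + 1, iz), a * wx * (1 - wz))
    np.add.at(grid, (ix, iz + 1), a * (1 - wx) * wz)
    np.add.at(grid, (ix + 1, iz + 1), a * wx * wz)
    # Kernel sampled at every grid offset; zero-padded FFTs give a linear convolution
    ox = dx * np.arange(-(nx - 1), nx)
    oz = dz * np.arange(-(nz - 1), nz)
    OX, OZ = np.meshgrid(ox, oz, indexing="ij")
    kern = ray_spread(OX, OZ, dk, k0_dtheta)
    shape = (3 * nx - 2, 3 * nz - 2)
    full = np.fft.ifft2(np.fft.fft2(grid, shape) * np.fft.fft2(kern, shape))
    return full[nx - 1:2 * nx - 1, nz - 1:2 * nz - 1]

# Small check with rays that sit exactly on grid points
xg = np.linspace(-1.0, 1.0, 21)
zg = np.linspace(-2.0, 2.0, 41)
xr, zr = xg[[3, 10, 17]], zg[[5, 20, 30]]
amp = [1.0, 0.5 - 0.2j, -0.3j]
img_d = image_direct(xg, zg, xr, zr, amp, 6.0, 8.0)
img_f = image_fft(xg, zg, xr, zr, amp, 6.0, 8.0)
```

For rays on grid points the two images agree to rounding error; for off-grid rays the FFT version is only as accurate as the deposit step. The cost of the FFT version depends on the image size rather than on the product of ray count and pixel count, which is consistent with the large gap in ray summation time reported in the table.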

3. XPATCH VALIDATION FOR INLETS

Figs. 1-3 show the comparison of the ISAR imageries generated at various frequencies using Xpatch1 and Xpatch3 versus a benchmark modal result for an open-ended rectangular duct of dimensions 1 m x 1 m x 8 m. The angular span used to generate the modal and the Xpatch1 data is ±3° about the central angle. The bandwidth used is 10% of the center frequency. The images under column (a) are for normal incidence and those under column (b) are for 45° incidence. From Fig. 1, we observe that the modal and the Xpatch1 results are almost indistinguishable at 10 GHz (duct opening of 33.3 λ). The Xpatch3 result adequately predicts the location of the termination contribution, but shows less spread in the ISAR plane. This is because the termination contribution is due to highly multiple bounce returns. Since Xpatch3 uses the ray trace information at only one central angle, the actual ray path fluctuations at nearby angles are not predicted. Figs. 2 and 3 show the same set of comparisons at 5 GHz and 2 GHz, respectively. We observe that even at 2 GHz, where the duct opening is only 6.7 λ, the location of the return is still well predicted by Xpatch. The amplitude of the Xpatch return does, however, show more discrepancy in comparison to the modal benchmark at lower frequencies.

Fig. 4 shows the ISAR images for a full-size airplane simulated using Xpatch1 and Xpatch3 at 10 GHz with a bandwidth of 1 GHz. The look angle is 30° from nose-on. The Xpatch1 image is generated using an aspect sweep of ±3°. In addition to the obvious point scatterers on the target, we notice a large cloud in the image which is due to the returns from the left inlet duct. This identification is quite obvious if one refers back to the earlier examples on the canonical duct. The comparison between the Xpatch1 image and the Xpatch3 image is good.
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Finally,  dynamic  ISAR  simulation  when  moving  parts  such  as  rotating  engine  blades 
exist  in  the  duct  is  presented.  In  range-Doppler  imaging  the  Doppler  frequency  shifts  from 
different  parts  of  the  target  are  linearly  related  to  the  cross-range  location  of  the  scatterers  on  the 
target.  When  moving  engine  blades  are  present,  additional  Doppler  shifts  are  produced  which 
strongly affect the ISAR image. This phenomenon is of particular importance since it can be
exploited  as  an  identification  feature  and,  therefore,  its  effect  on  the  ISAR  imagery  must  be 
fully  understood.  Simulated  ISAR  imagery  of  an  open-ended  duct  with  rotating  rotor  blades  is 
shown  in  Fig.  5.  The  dynamic  simulation  of  the  ISAR  image  was  done  by  calculating  the 
multi-frequency  scattered  field  data  at  different  time  snapshots  on  the  target  with  both  the  fan 
and  the  target  rotating.  The  Doppler  modulation  caused  by  the  moving  part  is  clearly  evident  in 
the  ISAR  image.  The  periodic  motion  of  the  fan  blades  gives  rise  to  harmonics  in  the  Doppler 
spectra  with  a  fundamental  frequency  fp  =  (fan  spin  rate)  x  (#  of  blades)  [7]. 
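The harmonic structure noted above can be illustrated with a toy signal. This sketch is not the Xpatch dynamic simulation: the function names, the phase-modulation model, and the numbers (50 revolutions per second, 20 blades) are assumptions chosen only to show that a return which repeats once per blade passage has spectral lines at multiples of (fan spin rate) x (number of blades).

```python
import numpy as np

def jem_fundamental(spin_rate_hz, n_blades):
    """Fundamental JEM frequency: (fan spin rate) x (number of blades)."""
    return spin_rate_hz * n_blades

def blade_modulation_spectrum(spin_rate_hz, n_blades, fs, duration):
    """Spectrum of a toy return modulated by identical, equally spaced blades."""
    n = int(round(fs * duration))
    t = np.arange(n) / fs
    # With identical blades the return repeats every 1 / (spin rate x blades) seconds
    phase = 2.0 * np.pi * jem_fundamental(spin_rate_hz, n_blades) * t
    signal = np.exp(1j * 0.8 * np.sin(phase))   # toy periodic phase modulation
    spec = np.abs(np.fft.fft(signal)) / n
    freqs = np.fft.fftfreq(n, 1.0 / fs)
    return freqs, spec

f0 = jem_fundamental(50.0, 20)                  # 1000 Hz for these toy numbers
freqs, spec = blade_modulation_spectrum(50.0, 20, 16000.0, 0.1)
lines = freqs[spec > 1e-6]                      # frequencies carrying energy
```

In this toy spectrum the energy sits only at integer multiples of the fundamental; unequal blades or a varying spin rate would add lines between them.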


ACKNOWLEDGEMENT 

This work was supported by NASA Grant NCC3-1589 and in part by the Joint Services Electronics Program under Contract No. AFOSR F49620-92-C-0027.


REFERENCES 

[1] H. Ling, R. Chou and S. W. Lee, "Shooting and bouncing rays: calculating the RCS of an arbitrary shaped cavity," IEEE Trans. Antennas Propagat., vol. AP-37, pp. 194-205, Feb. 1989.

[2] J. Baldauf, S. W. Lee, L. Lin, S. K. Jeng, S. M. Scarborough and C. L. Yu, "High frequency scattering from trihedral corner reflectors and other benchmark targets: SBR versus experiment," IEEE Trans. Antennas Propagat., vol. AP-39, pp. 1345-1351, Sept. 1991.

[3] D. J. Andersh, M. Hazlett, S. W. Lee, D. D. Reeves, D. P. Sullivan and Y. Chu, "Xpatch: A high frequency electromagnetic-scattering prediction code and environment for complex three-dimensional objects," IEEE Antennas Propagat. Mag., vol. 36, pp. 65-69, Feb. 1994.

[4] T. D. Sullivan, "A technique of convolving unequally spaced samples using fast Fourier transforms," Sandia National Laboratories, SAND89-0077, Jan. 1990.

[5] R. Bhalla and H. Ling, "Image-domain ray-tube integration formula for the shooting and bouncing ray technique," Tech. Rept., Univ. of Texas, April 1993. Also submitted for publication in Radio Science.

[6] R. Bhalla and H. Ling, "A fast algorithm for signature prediction and image formation using the shooting and bouncing ray technique," Tech. Rept., Univ. of Texas, January 1994. Also accepted for publication in IEEE Trans. Antennas Propagat.

[7] R. Bhalla, H. Ling, S. W. Lee and D. J. Andersh, "Dynamic simulation of Doppler spectra of targets with rotating parts," Microwave Optical Tech. Lett., vol. 7, pp. 840-842, December 1994.

Abstract: The iterative physical optics method has been shown to be very useful in the analysis of electromagnetic coupling and propagation in electrically large and smoothly varying ducts
such  as  jet  inlet  cavities.  In  this  approach  the  incident  field  excites  the  first  order  physical  optics 
(PO)  currents  on  the  walls  of  the  cavity,  which  are  then  allowed  to  re-radiate  a  finite  number 
of  times  to  account  for  multiple  reflections.  Each  re-radiation,  or  iteration,  essentially  provides 
the  next  higher  order  interaction.  Convergence  is  very  rapid  once  all  the  important  higher  order 
multi-bounce  effects  have  been  included.  The  same  method  may  be  applied  to  external  radiation 
and  scattering  problems  where  multiple  reflections  and  diffractions  are  important;  the  iterations 
provide  multiple  reflections  and  diffractions,  to  within  the  accuracy  of  PO,  without  ray  tracing. 
Numerical  results  are  presented  to  demonstrate  the  accuracy  and  convergence  of  the  method  for 
2-D  and  3-D  cavity  and  exterior  multi-bounce  problems. 


1  Introduction 

The  analysis  of  electromagnetic  (EM)  penetration  and  propagation  in  electrically  large  cavities  is 
important  for  predicting  the  scattering  by  jet  inlets  and  exhausts,  and  for  studying  the  EM  field 
distributions  inside  microwave  reverberation  chambers.  A  variety  of  high-frequency  asymptotic 
methods  have  been  developed  and  applied  to  this  problem  [1-3].  In  the  hybrid  modal  method  [1], 
the  cavity  is  modeled  with  sections  of  uniform  waveguides  and  the  natural  waveguide  modes  are 
used  to  describe  the  interior  fields.  This  limits  the  method  to  canonical  shapes,  and  materials  and 
geometric  perturbations  are  difficult  to  incorporate.  However,  it  yields  very  accurate  results  and 
is  often  used  to  provide  reference  solutions  for  approximate  methods  which  are  more  versatile. 
The  shooting  and  bouncing  ray  (SBR)  method  [2]  and  the  generalized  ray  expansion  (GRE)  [3] 
find  the  fields  inside  cavities  by  launching  a  dense  grid  of  ray-tubes  from  a  source  (or  sources)  and 
tracking  each  ray-tube  through  multiple  reflections  from  the  inner  cavity  walls.  These  methods 
can  handle  much  more  arbitrary  geometries  with  material  coated  walls,  but  have  limited  accuracy 
and may require a very large number of rays to be tracked. Furthermore, parameters associated with the ray launching, such as the ray-tube density and the discretization of sources, and the means of obtaining volumetric fields from discrete ray-tubes are generally not robustly defined.

The iterative physical optics (IPO) method has recently been applied to analyze the scattering by large open-ended cavities [4]. In this method, physical optics (PO) currents [5] excited by an incident source distribution replace the inner walls of the cavity. The PO currents then re-radiate iteratively to excite higher-order PO currents which account for the multiple reflections inside the cavity; each iteration adds one more internal reflection. The method is very robust because it only involves the integration of equivalent surface currents existing over the walls and apertures. Material coated or impedance surfaces may be approximately modeled via an equivalent surface impedance.

The IPO method may be formulated as an iterative solution to the magnetic field integral equation (MFIE), but with some additional rules regarding shadowing effects associated with the high-frequency approximations of PO. The simple rules avoid the problem of finding shadow boundaries because shadowing effects will automatically be included in the iteration process. They also make the iterative solution of the MFIE very rapidly convergent (to within the PO approximation) and resonances are avoided because the surface is usually not closed. The IPO method is much more efficient than an exact iterative solution to an integral equation because the number of iterations is related to the number of high-frequency interactions of importance (i.e., reflection, diffraction, and reflection-diffraction mechanisms), and does not depend on the minimization of a residual error. It is also more efficient than an exact integral equation solution because the discretization density may be only 4 to 16 samples per square wavelength instead of the usual 64 to 100.

The IPO method may also be applied to other multi-bounce problems, such as the propagation of communications signals in and around buildings in an urban environment. Each iteration adds another higher order reflection, diffraction, or reflection-diffraction mechanism until all the significant high-frequency interactions are included.

The  IPO  algorithm  is  presented  in  the  next  section  using  a  formulation  based  only  on  the 
magnetic  field  and  its  associated  equivalent  electric  surface  currents  (from  which  the  electric  field 
may  subsequently  be  found).  This  greatly  simplifies  the  implementation  because  singularities  in 
the  kernel  of  the  integral  equation  are  avoided.  If  material  surfaces  are  present,  the  procedure 
requires  a  slightly  more  complicated  combined  field  formulation  using  an  equivalent  surface 
impedance.  Numerical  results  and  conclusions  are  presented  in  the  last  section.  An  harmonic 
time  dependence  is  assumed  and  suppressed  throughout. 


2  The  IPO  Algorithm 

Figure 1(a) shows a typical cavity penetration/scattering geometry. An external plane wave
$(\bar{E}^i, \bar{H}^i)$ is incident on the aperture $S_a$ of the cavity. It is of interest to find the total fields
in the cavity interior and the external fields scattered by the cavity. The
fields scattered by external features of the geometry containing the cavity are not of
interest here, but could also be computed using PO. Figure 1(b) shows the equivalent current
problem formulated using only electric surface currents. The aperture current has an incident
and a scattered component,

$$\bar{J}(\bar{r}_a) = \bar{J}^i(\bar{r}_a) + \bar{J}^s(\bar{r}_a) \qquad (1)$$

where  the  incident  current  is  assumed  from  the  Kirchhoff  approximation  to  be 

$$\bar{J}^i(\bar{r}_a) = 2\hat{n} \times \bar{H}^i(\bar{r}_a). \qquad (2)$$


(a)  Original  geometry.  (b)  Equivalent  current  problem. 

Figure  1:  Open-ended  cavity  illuminated  by  an  external  plane  wave. 

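It is assumed that $\bar{J}^i$ radiates into the cavity, exciting the PO wall currents $\bar{J}$. At any point
$\bar{r}$ inside the cavity, the total magnetic field is then the sum of the incident field radiated by $\bar{J}^i$
plus the field radiated by $\bar{J}$:

$$\bar{H}(\bar{r}) = \bar{H}^i_c(\bar{r}) + \int_{S_c} \bar{J}(\bar{r}'_c) \times \nabla G_0(\bar{r} - \bar{r}'_c)\, dS' \qquad (3)$$

where

$$\bar{H}^i_c(\bar{r}) = \int_{S_a} \bar{J}^i(\bar{r}'_a) \times \nabla G_0(\bar{r} - \bar{r}'_a)\, dS' \qquad (4)$$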

vGAh)  =  (5) 

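The wall current $\bar{J}$ is found approximately by applying the laws of PO iteratively. The initial
value $\bar{J}_0$ is the first-order PO current given by

$$\bar{J}_0(\bar{r}_c) = 2\hat{n} \times \bar{H}^i_c(\bar{r}_c). \qquad (6)$$

The next value $\bar{J}_1$ is the sum of the first value $\bar{J}_0$ plus a principal value integral over $\bar{J}_0$:

$$\bar{J}_1(\bar{r}_c) = \bar{J}_0(\bar{r}_c) + 2\hat{n} \times \mathrm{P.V.}\!\int_{S_c} \bar{J}_0(\bar{r}'_c) \times \nabla G_0(\bar{r}_c - \bar{r}'_c)\, dS'$$

$$\phantom{\bar{J}_1(\bar{r}_c)} = 2\hat{n} \times \bar{H}^i_c(\bar{r}_c) + 2\hat{n} \times \mathrm{P.V.}\!\int_{S_c} \bar{J}_0(\bar{r}'_c) \times \nabla G_0(\bar{r}_c - \bar{r}'_c)\, dS'. \qquad (7)$$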

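As will be seen shortly, the principal value of the integral over $S_c$ is used so that the iteration
procedure leads directly to a form of the conventional magnetic field integral equation (MFIE).
The value for the current after two iterations is the sum of the first iterate $\bar{J}_1$ plus an integration
over the difference $\bar{J}_1 - \bar{J}_0$:

$$\bar{J}_2(\bar{r}_c) = \bar{J}_1(\bar{r}_c) + 2\hat{n} \times \mathrm{P.V.}\!\int_{S_c} \left[\bar{J}_1(\bar{r}'_c) - \bar{J}_0(\bar{r}'_c)\right] \times \nabla G_0(\bar{r}_c - \bar{r}'_c)\, dS'$$

$$\phantom{\bar{J}_2(\bar{r}_c)} = 2\hat{n} \times \bar{H}^i_c(\bar{r}_c) + 2\hat{n} \times \mathrm{P.V.}\!\int_{S_c} \bar{J}_1(\bar{r}'_c) \times \nabla G_0(\bar{r}_c - \bar{r}'_c)\, dS'. \qquad (8)$$

Each subsequent iteration reduces to the same result, i.e.,

$$\bar{J}_N(\bar{r}_c) = 2\hat{n} \times \bar{H}^i_c(\bar{r}_c) + 2\hat{n} \times \mathrm{P.V.}\!\int_{S_c} \bar{J}_{N-1}(\bar{r}'_c) \times \nabla G_0(\bar{r}_c - \bar{r}'_c)\, dS' \qquad (9)$$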

which has the conventional iterative form of the MFIE. However, it is important to keep in
mind that this iteration procedure is not expected to converge to the exact solution because
the surface is not closed. Furthermore, a rule based on physical insight into the theory of
PO is used when numerically evaluating the integral in (9). First, the surface is discretized into
flat facets and the current is assumed to be constant over each facet. The integral in (9) over
a facet at $\bar{r}'_c$ radiating to a facet at $\bar{r}_c$ is non-zero only if the facet at $\bar{r}_c$ “faces” the facet at $\bar{r}'_c$.
Mathematically, this condition is defined by

$$\hat{n}(\bar{r}_c) \cdot (\bar{r}_c - \bar{r}'_c) < 0 \qquad (10)$$

where $\hat{n}(\bar{r}_c)$ is the unit surface normal of the facet at $\bar{r}_c$, and is enforced regardless of any
intervening wall facets. This definition helps the algorithm to handle shadowing effects caused
by convex protrusions in the cavity walls, without searching for shadow boundaries. As the
numerical results will show, the iteration is very rapidly convergent even for fairly complex geometries. It
is noted that the evaluation of $\bar{H}^i_c(\bar{r}_c)$ using (4) follows the same rule, with $\bar{r}'_a$ substituted for $\bar{r}'_c$.
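The facing rule can be sketched in a few lines of Python. This is only an illustration, assuming the condition reads n(r_c) · (r_c − r'_c) < 0 with the unit normal pointing into the cavity interior; the function names `faces` and `visible_sources` are hypothetical and are not taken from the paper.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def faces(r_obs, n_obs, r_src):
    """Return True if the facet at r_obs 'faces' the facet at r_src.

    Assumed form of the rule: n(r_obs) . (r_obs - r_src) < 0,
    with n_obs the unit normal pointing into the cavity.
    """
    diff = tuple(o - s for o, s in zip(r_obs, r_src))
    return dot(n_obs, diff) < 0.0


def visible_sources(index, centers, normals):
    """Indices of source facets that contribute to the facet `index`.

    Intervening facets are deliberately ignored, as the text states.
    """
    r_obs, n_obs = centers[index], normals[index]
    return [j for j, r_src in enumerate(centers)
            if j != index and faces(r_obs, n_obs, r_src)]


# Toy geometry (hypothetical): a floor facet, a facet above it, one below it.
centers = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (0.0, 0.0, 1.0)]
print(visible_sources(0, centers, normals))
```

Only the facet lying on the side of the floor facet's normal is kept; the one behind it is skipped without any shadow-boundary search.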

Once a reasonable approximation for $\bar{J}$ has been reached, the scattered current in the aperture
is then given by (again using the Kirchhoff approximation)

$$\bar{J}^s(\bar{r}_a) = -2\hat{n} \times \int_{S_c} \bar{J}(\bar{r}'_c) \times \nabla G_0(\bar{r}_a - \bar{r}'_c)\, dS' \qquad (11)$$

which radiates the scattered fields into the external region.

3  Numerical  Results 

In  [4],  results  are  shown  which  indicate  that  a  discretization  density  of  9  facets  per  square 
wavelength  gives  very  accurate  results  for  smoothly  varying  inlet  duct  geometries  larger  than 
a  few  wavelengths  in  diameter.  4  facets  per  square  wavelength  has  also  been  shown  to  give 
adequate  results,  especially  for  larger  duct  geometries.  This  rather  coarse  discretization  is 
attainable  because  the  fields  may  be  sampled  at  the  Nyquist  rate  over  smooth  surfaces  which 
are  slowly  varying  with  respect  to  wavelength. 
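To give a feel for what these densities mean in unknowns, the sketch below counts facets for a cylindrical cavity with one flat end cap. It assumes that "4-by-4 wavelength" means diameter by depth, which is an interpretation rather than something stated in the text; the helper names are hypothetical.

```python
import math


def cylinder_wall_area(diameter_wl, depth_wl):
    """Interior area (side wall plus one flat end cap) in square wavelengths."""
    radius = diameter_wl / 2.0
    return math.pi * diameter_wl * depth_wl + math.pi * radius ** 2


def facet_count(area_sq_wl, facets_per_sq_wl):
    """Number of facets needed at a given sampling density."""
    return math.ceil(area_sq_wl * facets_per_sq_wl)


area = cylinder_wall_area(4.0, 4.0)
for density in (4, 9, 64, 100):
    # 4 and 9 are the IPO densities quoted in the text; 64 and 100 are the
    # "usual" integral-equation densities quoted in the introduction.
    print(density, facet_count(area, density))
```

Under these assumptions the facet count scales linearly with the density, so moving from 64 to 9 samples per square wavelength cuts the number of unknowns by roughly a factor of seven.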

Figure  2  demonstrates  the  convergence  of  the  algorithm  in  predicting  the  radar  cross  section 
(RCS)  patterns  of  a  cylindrical  cavity,  and  compares  the  results  with  a  modal  reference  solution. 
The $N = 0$ case corresponds to simply using the first-order PO currents given by (6) without
any iteration. Adding iterations gives more internal reflections which become significant for
wider  incidence  angles.  For  a  maximum  incidence  angle  of  50°,  3  iterations  is  adequate  for  this 
geometry.  Of  course,  a  deeper  cavity  would  require  more  iterations  because  there  would  be  more 
internal  reflections  of  importance. 
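The iteration count discussed here can be illustrated with a toy fixed-point loop of the same form as the IPO update, J_N = J_0 + K J_{N-1}. In this sketch the matrix `K_toy` merely stands in for the discretized facet-to-facet coupling operator; its entries are arbitrary illustrative numbers, not values computed from any cavity geometry.

```python
def matvec(K, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(k * x for k, x in zip(row, v)) for row in K]


def ipo_iterate(K, j0, n_iter):
    """Return [J_0, J_1, ..., J_n] with J_N = J_0 + K J_{N-1}."""
    history = [list(j0)]
    current = list(j0)
    for _ in range(n_iter):
        coupled = matvec(K, current)          # one more "bounce"
        current = [a + b for a, b in zip(j0, coupled)]
        history.append(current)
    return history


def max_change(a, b):
    """Largest entry-wise change between two iterates."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical two-facet example: weak mutual coupling only.
K_toy = [[0.0, 0.3], [0.2, 0.0]]
j0_toy = [1.0, 0.5]
history = ipo_iterate(K_toy, j0_toy, 3)
changes = [max_change(history[i + 1], history[i]) for i in range(3)]
print(changes)
```

Each pass adds one more order of interaction, and the change between successive iterates shrinks when the coupling is weak; a deeper cavity corresponds to stronger or longer coupling chains and therefore more passes before the change becomes negligible.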

Figure  3  shows  the  RCS  patterns  of  a  cylindrical  cavity  with  a  hub  termination.  The  modal 
reference  solution  uses  the  mode-matching  technique  and  was  obtained  from  John  Volakis  and 
Hristos  Anastassiu  at  the  University  of  Michigan  Radiation  Laboratory,  Ann  Arbor.  CICERO 
is  a  BOR  moment  method  code  developed  by  McDonnell-Douglas,  St.  Louis,  MO.  The  results 
show  that  good  accuracy  is  attainable  for  complex  geometries  at  a  low  computational  cost  using 
the  IPO  algorithm. 

Figure  4  shows  a  2-D  external  multi-bounce  problem  consisting  of  a  line  source  in  the  presence 
of  three  square  blocks  (buildings).  The  far-field  radiation  pattern  is  computed  using  the  IPO 
algorithm  and  compared  with  a  moment  method  solution  (MoM).  For  more  complex  geometries 
such  as  this,  a  slightly  modified  form  of  (9)  has  been  found  to  have  better  convergence  properties: 

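$$\bar{J}_N(\bar{r}_c) = \hat{n} \times \bar{H}^i_c(\bar{r}_c) + \hat{n} \times \mathrm{P.V.}\!\int_{S_c} \bar{J}_{N-1}(\bar{r}'_c) \times \nabla G_0(\bar{r}_c - \bar{r}'_c)\, dS' \qquad (12)$$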

The initial value $\bar{J}_0$ is still given by (6). Figure 4 shows a reasonably well-converged IPO result
after only 4 iterations; after 10 iterations the result has converged to within graphical resolution.
The agreement with the MoM solution is excellent, considering that the approximations of PO
have been used according to the rule defined by Equation (10).

Figure 2: RCS patterns of a 4-by-4 wavelength cylindrical cavity with a flat termination.
Frequency = 10 GHz. 9 facets per square wavelength used for IPO results.

Figure 4: Far-field pattern of an electric line source radiating in the presence of 2-D buildings.

Abstract

A hybrid analysis is developed in this paper for the electromagnetic scattering by electrically
large, elongated open cavities containing a large interior termination when the illumination
is from the exterior region. The analysis is divided into three basic parts. One of
these parts deals with the external scattering from the open end being illuminated, and
also with the coupling of the rest of this illumination just inside the cavity via the aperture
formed at the opening. The other two parts deal with the propagation of the latter cavity
coupled field through its length to a region near the interior termination at the other end of
the cavity, and with the scattering of these fields by the interior termination, respectively.

The  analysis  of  the  three  parts  can  be  done  separately  by  methods  best  suited  for  each  one 
and  then  combined  systematically  via  generalized  reciprocity  relations.  The  propagation 
region  analysis  is  performed  via  ray  methods,  and  in  particular  a  highly  efficient  new  ray 
tube  basis  set  is  employed  which  tends  to  track  the  shape  of  the  ray  tubes  as  they  propagate 
within  the  cavity  thereby  reducing  the  number  of  ray  tubes  by  an  order  of  magnitude  over 
that  required  when  using  conventional  ray  tubes.  In  this  paper,  special  emphasis  is  given 
to  the  propagation  region  analysis. 


I  Introduction 


The electromagnetic (EM) scattering from an externally illuminated electrically large open cavity
of relatively arbitrary shape and containing a large interior termination is analyzed via a
hybrid approach. Figure 1 illustrates the geometry of this problem. It is assumed that the
medium surrounding the cavity is free space. The scattering from the external features of the
cavity and its housing is not of interest here except for the scattering from the open end being
illuminated;  however,  if  desired,  these  other  external  effects  can  in  general  be  included  via  the 
uniform  geometrical  theory  of  diffraction  (UTD)  and  its  modifications  [1].  The  scattering  from 
the  open  end  region  being  illuminated,  henceforth  referred  to  as  region  1,  can  be  done  by  UTD 
and  its  modifications,  or  by  numerical  methods,  to  provide  not  only  the  external  fields  scattered 
by  the  opening,  but  also  the  fields  coupled  just  inside  the  cavity  via  the  open  end.  The  fields 
coupled  into  region  1  then  propagate  inside  the  elongated  cavity  region  which  is  referred  to  as 
region 2. Region 1 analysis also provides the equivalent sources over the cavity cross section $S_a$
just inside the opening to couple into region 2 via the generalized ray expansion (GRE) [2, 3].
The GRE launches a discrete set of ray tubes from an array of points over the sources in $S_a$ and
propagates  them  into  region  2  via  ray  bounces  at  the  interior  cavity  walls.  The  ray  paths  need  to 
be  found  only  once  in  the  GRE  for  a  given  cavity  geometry;  only  the  strengths  of  the  rays  change 
with  illumination  but  the  ray  paths  do  not.  The  cavity  is  assumed  to  be  perfectly  conducting, 
but the interior cavity walls may contain a thin lossy material coating. The ray tubes are initially
assumed to have a circular cross-section which, upon reflection, can become elliptical. The
conventional  GRE  ray  tubes  are  chosen  to  be  sufficiently  thin  so  that  their  cross-section  can  be 
approximated  as  being  circular  even  after  each  bounce;  however,  the  use  of  this  approximation 
requires an extremely large number of ray tubes because they are thin. A new set of ray tubes
with an elliptic cross-section is thus introduced into the GRE; these also start out with a circular
cross-section but simple rules are developed to track their shape after each reflection. These
new  tubes  are  referred  to  as  elliptic  ray  basis  functions  (ERBF’s)  and  their  propagation  rules 
predict  how  their  elliptic  cross-section  can  rotate  and  change  shape  with  propagation  distance. 
The  latter  information  allows  the  ERBF’s  to  be  much  fatter  than  the  conventional  narrow  tubes. 
Typically,  an  order  of  magnitude  fewer  ERBF’s  than  conventional  ray  tubes  may  be  used.  Thus, 
the ERBF’s make the GRE more efficient. The GRE fields are evaluated over a cavity cross-section
$S_t$ at the end of region 2 near the interior termination. The termination region beyond $S_t$
is designated as region 3. The GRE fields at $S_t$ illuminate the termination which then scatters
fields back to $S_t$; the latter fields are found separately by numerical methods or by asymptotic
high  frequency  approximations  if  applicable.  An  analysis  of  these  three  regions  can  be  performed 
separately  by  methods  best  suited  for  each  region  and  then  combined  systematically  to  arrive 
at a hybrid solution. This hybrid scheme is summarized next in Section II, where the separate
analyses of regions 1, 2 and 3 are briefly discussed in Parts A, B and C, and these
analyses are then systematically combined in Part D via some generalized reciprocity relationships.
Section III presents some numerical results. An $e^{j\omega t}$ time dependence is assumed
and suppressed.

II  Summary  of  the  Hybrid  Procedure 

The field scattered by the electrically large cavity to some external point $P$ sufficiently far from
it, when it is illuminated by an external source as in Figure 1, can be expressed as [2,3]

$$\bar{E}^s(P) = \bar{E}^{so}(P) + \bar{E}^{sc}(P) + \bar{E}^{se}(P) \qquad (1)$$

where $\bar{E}$ refers to the electric field and the superscript $s$ refers to the scattered component of
this field. Here $\bar{E}^{so}(P)$ is the field directly scattered into the exterior region by the edge of
the open end being illuminated, whereas $\bar{E}^{sc}(P)$ in (1) is the field scattered into the exterior by
the interior cavity effects such as the interior walls and the termination. The term $\bar{E}^{se}(P)$ in
(1) represents the contribution to $\bar{E}^s(P)$ which results from other external features of the cavity
(e.g., the cavity housing); it is not of interest in the present work and will thus be neglected.
However, $\bar{E}^{se}(P)$ can generally be found via the UTD and its modifications [1]. It is important to
note that $\bar{E}^{sc}(P)$ is generally the dominant contributor to $\bar{E}^s(P)$ in (1) as compared to $\bar{E}^{so}(P)$,
this being the case due to the electrically large termination which is assumed to exist within the
cavity.

A  Analysis  of  Region  1  (Aperture  Region) 

The open end of the cavity being illuminated forms an aperture $S$. The analysis of this aperture
region involves two parts. In one part, $\bar{E}^{so}(P)$ scattered externally to $P$ by the edge of $S$ which
is illuminated must be found; in the other, the remaining field $\bar{E}^a(P_a)$ that is coupled from the
external illumination via $S$ to any point $P_a$ in an interior aperture $S_a$ just inside the cavity as
shown in Figure 1 must also be found. If the incident field $\bar{E}^i$ on $S$ is ray optical, then $\bar{E}^{so}(P)$
may be found via the UTD [1] or in a more general fashion via the equivalent current method
(ECM) as indicated in [1, 3]. In the special but generally rare situations in which ECM becomes
inapplicable (e.g., if $\bar{E}^i$ is not ray-optical and if the relevant equivalent currents cannot be found),
a numerical method of solution must be employed to find $\bar{E}^{so}(P)$; such a method is currently
under investigation. Likewise, the field $\bar{E}^a(P_a)$ at any point $P_a$ in $S_a$ may be found via the
UTD and its modifications [1]. Again, if for any reason $\bar{E}^a(P_a)$ cannot be found via UTD or its
appropriate modifications (such as ECM) then one must resort to numerical solution techniques.

B  Analysis  of  Region  2  (Propagation  Region) 

The  electric  and  magnetic  fields  (£'“,P“)  at  any  point  Pa  in  Sa  found  in  part  A  define  a  set 
of  equivalent  surface  currents  on  Sa  that  launch  the  fields  (P'",//'")  into  the  cavity 

interior. 

=  nx  ;  A/;  =  X  n|^  (2;3) 

The  generalized  ray  expansion  (GRE)  is  employed  to  find  (P'”,P'")  from  J“  and  in  Sa- 
Actually, 

+  ;  P’”  =  +  (4;5) 

where  {E'^,H'^)  are  the  fields  which  propagate  from  Sa  to  a  mathematical  surface  St  chosen 
sufficiently  near  the  termination;  whereas,  {E]f,H'f)  are  part  of  the  fields  which  after  being 
launched  from  Sa  return  to  Sa  without  reaching  St-  The  {E'I,H]f)  generally  exist  if  the  interior 
cavity walls are tapered between $S_a$ and $S_t$. According to the GRE, $S_a$ is divided into $\mathcal{L}$ subapertures
such that $S_a = \sum_{l=1}^{\mathcal{L}} \Delta S_l$, where the size of $\Delta S_l$ is dependent mostly on the overall length
$L_c$ of the cavity from $S_a$ to $S_t$. In particular, the maximum linear dimension $D_l$ of the largest
$\Delta S_l$ should typically be smaller than $\sqrt{2 L_c \lambda}$, where $\lambda$ is the free space wavelength, and $D_l$ should
also be less than half the maximum transverse dimension of $S_a$. This choice of $\Delta S_l$ allows the
fields radiated into the cavity by the equivalent surface currents on $S_a$ to be approximated by a superposition of the fields of
spherical waves originating from the phase centers $O_l$, within $\Delta S_l$, of each of the $\mathcal{L}$ subapertures.
For example, the fields launched from $O_l$ in $\Delta S_l$ of the $l$th subaperture ($l = 1, 2, \ldots, \mathcal{L}$) can be
expressed as a superposition of the fields of $P$ non-overlapping ray tubes emanating radially
out from $O_l$ as in Figure 2(a). The number $P$ of ray tubes, i.e., the size and hence the
density of ray tubes, is chosen to adequately represent the field within each ray tube via the rules
of ray optics as each ray tube propagates via interior reflections along the entire length of the
cavity. At present, any diffraction effects at the smooth interior cavity walls are neglected; this
is generally reasonable since the interior wave effects are mostly dominated by multiple internal
reflections. Thus,

(a) Ray field originating from the phase center $O_l$ of the $l$th subaperture of area $\Delta S_l$ (shaded).
(b) Grid of ray tubes originating from $O_l$ of any subaperture of area $\Delta S_l$.

Figure 2: Launching of GRE rays.

E'ns.)  =  ;  e=(S.)  ^  't't  E%(s.)  (6;7) 

1  =  1  p^l  l=\  "1  =  1 


where  is  the  field  of  the  ray  tube  launched  from  Ot  in  the  /th  subaperture  ASi  that 

ultimately  reaches  some  point  in  St  after  N  bounces.  The  £^^;(5o)  has  an  analogous  interpre¬ 
tation.  Even  though  the  ray  tube  discretization  shown  in  Figure  2(b)  does  not  automatically 
produce  ray  tubes  with  circular  cross-section,  one  can  define  an  “effective”  circular  cross-section 
for  each  ray  tube  at  launch.  Thus,  the  radius  “a”  of  the  effective  circular  cross-section  of  the  ray 
tube  is  given  adequately  by  a  Rs  where  d  is  the  propagation  distance  along  the  axial  ray 
in  the  tube  measured  from  its  launch  point,  and  A0  is  the  angle  between  any  pair  of  adjacent 
ray  tubes  that  emanate  from  that  same  launch  point.  The  field  E'JKSt)  at  a  point  in  St  can  be 
expressed  in  terms  of  its  components  E^  and  E^,  which  are  respectively  ||  and  1  to  the  plane 
of  incidence  defined  at  the  point  of  last  (A^th)  reflection  before  reaching  a  point  in  Sf  Thus,  m 
matrix  notation,  E'Ji{St)  becomes  [3]: 


detg„(rp,) 

detg,(0) 


^-jkfpo 


r 


po 
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.  (/  (l  _  [x,NVQ%(r,t,)[x,K])  ■  (8) 

where the notation $\prod_{q=1}^{N} \Omega_q = \Omega_N \cdot \Omega_{N-1} \cdots \Omega_3 \cdot \Omega_2 \cdot \Omega_1$ is employed in (8). The field reaches
$S_t$ from $O_l$ via $q = 1, 2, 3, \ldots$ bounces as in Figure 2(a). Note that $r_{p0}$ is the distance from
$O_l$ to the point of first reflection ($q = 1$) along the ray. Also $r_{p,q-1}$ is the distance only from
the $(q-1)$th reflection to the next, or $q$th, reflection along the ray; thus, $r_{pN}$ is the distance
along the ray from the last (or $N$th) reflection to a point in $S_t$. The $[R_{pq}] = \mathrm{diag}(R_\parallel, R_\perp)$ is
the usual 2 × 2 Fresnel reflection coefficient matrix defined with respect to the plane of incidence
fixed at the $q$th reflection of the $p$th ray. The 2 × 2 matrix $[T_{pq}]$ transforms the field polarization
of the $p$th ray before the $q$th reflection occurs to one fixed in the $q$th plane of incidence from that
fixed in the previous or the $(q-1)$th plane of incidence, because the plane of incidence can in
general change at each reflection on an arbitrary curved interior cavity wall. The 2 × 1 column
matrix $[C_{pl}]$ denotes the two orthogonal components of the vector radiation pattern $\bar{C}_{pl}$ of the
electric field along the $p$th ray launched from $O_l$ with the cavity walls absent.
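The bookkeeping described above, a polarization transformation followed by a Fresnel reflection at every bounce, can be sketched as a chain of 2 × 2 matrix products. The sketch below makes several assumptions that are not in the paper: the transformation matrix is modeled as a plain rotation by a given angle, the perfectly conducting wall uses the convention R_parallel = +1 and R_perp = −1, and the spreading and phase factors of (8) are omitted entirely. All function names are hypothetical.

```python
import math


def apply2(m, v):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def fresnel_matrix(r_par, r_perp):
    """Diagonal reflection matrix in (parallel, perpendicular) components."""
    return [[r_par, 0.0], [0.0, r_perp]]


def rotation_matrix(angle):
    """Assumed form of the polarization transformation: a basis rotation."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s], [-s, c]]


def trace_polarization(e_launch, bounces):
    """Apply, bounce by bounce, the rotation and then the reflection matrix.

    bounces: list of (rotation_angle, r_par, r_perp) in order of occurrence.
    """
    e = list(e_launch)
    for angle, r_par, r_perp in bounces:
        e = apply2(rotation_matrix(angle), e)          # express in new plane of incidence
        e = apply2(fresnel_matrix(r_par, r_perp), e)   # reflect
    return e


# Three bounces off an assumed perfectly conducting wall.
pec_bounces = [(0.0, 1.0, -1.0), (math.pi / 3, 1.0, -1.0), (math.pi / 6, 1.0, -1.0)]
e_out = trace_polarization([1.0, 0.0], pec_bounces)
print(e_out, math.hypot(*e_out))
```

With lossless coefficients the field magnitude is preserved through every bounce, so any amplitude change in the full expression comes from the spreading factors and from lossy wall coatings.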

Cp,  =  ~  JJ  \fpo  X  fpo  X  r,(f',)  +  KoAo  X 

AS, 

(9) 


«I1  0 

0  R± 


with $Y_0 = Z_0^{-1}$, where $Z_0$ is the free space impedance. Also $\bar{r}'_l$ is any point in $\Delta S_l$ which is measured
from $O_l$. It is noted that $\bar{C}_{pl}\, e^{-jk r_{p0}} / r_{p0}$ constitutes the $p$th ray electric field incident at the first point
of reflection ($q = 1$). The $Q_q(\cdot)$ denotes the wavefront curvature matrix of the $p$th ray after
the $q$th reflection; in particular, $Q_q(0)$ is the reflected wavefront curvature matrix of the $p$th ray
evaluated at the $q$th point of reflection while $Q_q(r_{pq})$ is its value at the next point of reflection
$q+1$ after this ray propagates a distance $r_{pq}$. The elements of the 1 × 2 matrix $[x_{pN}]$ are the
coordinates along the $\parallel$ and $\perp$ directions fixed in the plane of incidence of the $p$th ray at the last
($N$th) point of reflection. The magnetic field associated with this ray electric field is approximately
$Y_0\, \hat{r}_{pN} \times$ the electric field. The
GRE field of the $p$th ray from $O_l$ is the same as the conventional one given previously in [2,3]
except for the new modification contained in $U$ of (8) which tracks the shape of the $p$th tube
as it propagates via reflections. The shape is an ellipse whose boundary changes (rotates and
changes size) with propagation. Hence the GRE field in (8) is said to be given in terms of the
ERBF’s. The rules for this change in shape of the ray tube ellipse after each bounce are given by
the matrix $Q_q(\cdot)$ in (8) as


with 


1 

0 


1  0 
0  —  cos  0'^ 


1 

0 


Q“(0) 


1  0 

0  cos 


(10) 


Qoi'^po)  — 


(11) 


and  U{t)  =  I  q’I  ^  Q  angle  of  incidence  (or  reflection)  that  the  incident  (or 

reflected) ray makes with the normal to the interior cavity wall at the $q$th reflection point.

(a) Fields represented by discrete plane waves propagating out of the $S_t$ plane ($z_t = 0$)
associated with the open waveguide cavity region. (b) Plane waves shown in part (a) of this
figure are now incident on $S_t$ associated with the termination region.

Figure 3: Decomposition of original geometry at $S_t$.

C  Analysis  of  Region  3  (Termination  Region) 

In this section, the contribution to the external scattered field produced by the interior cavity
termination is considered, and the analysis for incorporating the effects of scattering within the
cavity by the termination when it is excited by the fields arriving at $S_t$ from $S_a$ is summarized.
Let the fields arriving at $S_t$ from $S_a$, which are defined as $(\bar{E}^+_t, \bar{H}^+_t)$, be assumed to exist in the
ABSENCE of the termination and with the region beyond $S_t$ being assumed to be a smooth
extension such that no waves are reflected back to $S_t$ due to this extension. Usually, the cross-section
at $S_t$ is circular (as, for example, in a jet inlet cavity) and one may assume the infinite
extension beyond $S_t$ to be a smooth circular waveguide in such a situation. The fields $(\bar{E}^+_t, \bar{H}^+_t)$
are thus the unperturbed fields (i.e., in the presence of the smooth extension of the cavity beyond
$S_t$ and with the termination absent). Next, let these unperturbed fields $(\bar{E}^+_t, \bar{H}^+_t)$ radiate the
fields $(\bar{E}^r, \bar{H}^r)$ from $S_t$ if the region beyond $S_t$ is now removed. These radiation fields $(\bar{E}^r, \bar{H}^r)$
constitute a Kirchhoff approximation because they are assumed to be produced outside $S_t$ by the
“unperturbed” fields $(\bar{E}^+_t, \bar{H}^+_t)$. One may express $\bar{E}^r(x_t, y_t, z_t)$ at any point $(x_t, y_t, z_t)$ external
to $S_t$ as a plane wave spectral (PWS) integral. The continuous PWS can be approximated by a
sufficient number ($Q$) of discrete “propagating” plane waves, as shown in Figure 3(a):

E^{xt,yt,zt)  « 

9=1 

Each of these $Q$ plane waves is next made to illuminate the obstacle or termination region as
shown in Figure 3(b). Each plane wave, when allowed to be incident on $S_t$ of the termination
region (see Figure 3(b)), produces fields which are scattered by the obstacle or
termination region. These scattered fields may be found numerically via an integral
equation or finite element method, or via some approximate high frequency method such as
the physical theory of diffraction (PTD) [1] if applicable, or by measurements if possible. In
these analytical high frequency, numerical or experimental simulations to find the scattered fields, one
illuminates the termination region geometry of Figure 3(b) with a field which is locally a unit
amplitude plane wave over $S_t$, and which is incident in the same direction and with the same polarization
as the $q$th plane wave. The far zone bistatic scattering would then be evaluated or measured in the half space
to the left of $S_t$ in Figure 3(b). Next, this far field data would be weighted appropriately by
the amplitude of the $q$th plane wave and transformed via a fast Fourier transform (FFT) to give the values of
the scattered fields back at the plane $S_t$. Alternatively, the scattered fields could also be obtained directly on $S_t$
if one employs analytical or numerical methods. The complete termination region scattered field
at $S_t$ due to all of the $Q$ plane waves illuminating $S_t$ is given by

(13) 

‘  <J=1 

In realistic cases, the termination region must be truncated smoothly or with absorbers to minimize
the external scattering effects arising from this isolated region. The edge diffraction effects
arising from the rim of $S_t$ in Figure 3(b) are unavoidable with the procedure which employs
a physical breakup of the actual cavity into the isolated propagation and termination regions;
however, these edge effects are generally weak in the scattered fields if $S_t$ is electrically large. An
alternative procedure to find the scattered fields without a physical breakup at $S_t$ is also possible but
is not discussed here because it is not practical if an experimental approach is used to find these
fields.


D Hybrid Procedure to find $\bar{E}^{sc}(P)$

The results of parts B and C are next combined to find $\bar{E}^{sc}(P)$. Let a test electric current source
$\bar{J}_t$ be placed at the observation point $P$, where

$$\bar{J}_t = \bar{p}\, \delta(\bar{r} - \bar{r}_P). \qquad (14)$$

Then it is convenient to write $\bar{E}^{sc}(P)$ as the sum of two terms

ELiP)  =  E’:i,{P)  +  ESAP)  (15) 

where the $\bar{p}$ component of each of these two terms, with $\bar{p}$ being arbitrary, is shown via a generalized
reciprocity theorem [4] to be


E£.{P)  ■  P  «  //  X  H'A  -  E'A  X  H”)  ■  di  (16) 

and  in  an  analogous  fashion 

ESdP)  •  P  «  -  Et  X  WI)  ■  ds.  (17) 

5a 

In  (16),  the  (El^.Hl^)  are  fields  produced  at  St  by  Jt  in  the  presence  of  the  cavity  but  in  the 
absence  of  the  interior  termination  assuming  that  the  region  beyond  St  is  assumed  to  extend 
smoothly  to  infinity  [4].  This  can  be  found  via  GRE  exactly  as  discussed  in  part  B  of  Section 
II  to  obtain  [E'^.H'^)  at  5i,  and  using  the  same  ray  paths.  The  {Ef,f{^)  in  (17)  are  the  fields 
produced  at  Sa  by  Jt  in  the  presence  of  the  cavity  which  is  assumed  to  extend  smoothly  to 
infinity  beyond  Sa  with  termination  removed  so  that  there  are  no  interior  reflections  beyond  Sa- 
These fields may be found in a manner analogous to that used to find $(\bar{E}^a, \bar{H}^a)$.

Vertical polarization: (a) GRE using 84,548 conventional ray tubes. (b) GRE using 17,436 ERBF’s.

Figure  4:  RCS  patterns  of  a  6-by-6  wavelength  cylindrical  cavity.  GRE  used  32  subapertures. 

III Numerical Results

Figure 4 shows the RCS patterns of a 6-by-6 wavelength cylindrical cavity found using GRE with
conventional ray tubes and GRE with ERBF’s, compared with a modal reference solution.

Abstract: By examining the scattering from inlets terminated by fan-like structures possessing
discrete angular symmetry, it is found that only a very limited amount of intermodal
coupling is possible. This fact is exploited in a hybrid finite element/modal scheme
to develop a very efficient solution, where only one slice of the geometry need be modeled.
At this presentation, the method will be outlined and results will be presented for validation
purposes. Specific attention will be given to the implementation of the phase
boundary conditions which are essential in taking advantage of the engine’s angular periodicity.
The phase boundary condition must be extended to handle a domain which
includes the axis of the inlet, and a scheme for including degrees of freedom along the axis
will be given. Alternatively, the overlapping modal and geometric symmetries can be
exploited by other means to develop simplified analysis and characterization schemes
which avoid volume meshing altogether without compromising the geometrical adaptability
of the formulation. One such solution scheme, which employs the limited set of coupling
modes as the basis of the solution, will be presented and validated with measured
and reference data.

1.0 Introduction


The  use  of  a  hybrid  finite  element/modal  technique  to  model  the  radar  scattering  from  jet 
engine  inlets  is  depicted  in  Figure  1  [1]  [2].  Briefly,  the  finite  element  method  (FEM)  is 
employed to generate a modal scattering matrix for the engine face while some high frequency
or modal technique is used to trace the fields in and out of the inlet. It is necessary to
perform  the  FEM  analysis  once  for  each  traveling  mode  in  the  circular  inlet  to  generate  the 
scattering  matrix.  That  is,  the  incoming  field  is  decomposed  into  waveguide  modes  prior  to 
the  application  of  the  FEM  and  the  FEM  is  used  only  to  generate  the  modal  scattering  matrix. 


[Figure labels: 3D FEM region; incident wave is decomposed into modes; 3-layer, metal-backed modal absorber for mesh truncation; modal/FEM connection boundary where the modal scattering matrix is defined.]

FIGURE 1. Hybrid FEM/Modal analysis.

While the hybrid finite element/modal technique was validated in [1] for an engine-like
termination consisting of straight blades, the electrical sizes considered were small
(approximately 1λ radius), whereas a typical inlet can have a radius of 10λ or greater.
Because the number of degrees of freedom grows as the square of the radius, applying the
method directly to large structures would incur staggering computational costs. Also, the
number of traveling modes (note that the analysis must be repeated for each mode) and the
size of the scattering matrix grow as the radius squared. For the inlet configurations
considered in [1], approximately 50,000 elements were needed with about 20,000 degrees of
freedom, and the analysis was repeated approximately 10 times, once for each mode. Given
this, an inlet which is 10λ in radius would require 100 times the computational resources
per solve (5,000,000 elements, 2,000,000 degrees of freedom) and 100 times as many solves
(1,000 instead of 10), thus increasing the total computational cost by a factor of about
10,000. In effect, the computational cost increases as the radius to the fourth power.
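
The scaling argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name and its defaults are assumptions, with the baseline figures (50,000 elements, 20,000 degrees of freedom, 10 modes at roughly 1λ radius) taken from the text.

```python
def scaled_cost(radius_ratio, base_elements=50_000, base_dofs=20_000, base_modes=10):
    """Scale the reference problem of [1] to an inlet radius_ratio times larger.

    Assumes, as stated in the text, that both the mesh size and the number
    of traveling modes grow as the square of the radius.
    """
    growth = radius_ratio ** 2
    elements = base_elements * growth
    dofs = base_dofs * growth
    solves = base_modes * growth        # one FEM solve per traveling mode
    relative_cost = growth * growth     # (mesh growth) x (growth in number of solves)
    return elements, dofs, solves, relative_cost


elements, dofs, solves, cost = scaled_cost(10)
print(elements, dofs, solves, cost)
```

For a radius ratio of 10 this reproduces the 5,000,000 elements, 2,000,000 degrees of freedom, 1,000 solves, and factor of 10,000 quoted above.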

Obviously, some physically derived simplification is needed to scale the problem to a workable
size. By exploiting the cyclic, geometric symmetry which exists in an engine face, it is
shown that the entire problem can be reduced down to a single unique slice of the geometry.
For example, if the engine face has 40 blades, it is sufficient to model, and carry out the
analysis for, a single angular period of the geometry, which encompasses (1/40)th of the total
computational volume. To achieve this computational scaling, it is necessary to work with
modal field excitations and not plane wave excitations. The modal field excitations can be
found by decomposing the incident plane wave, as is typically done for generating the
scattering matrix of the termination. By exploiting the modal (excitation) and geometric
symmetries, it can be shown that a very limited set of scattered modes is possible. It is also
demonstrated that all of the scattered modes have a constant phase shift from one geometry
slice to another. From a computational point of view, since all scattered modes have equal
phase shift across the slice, a phase boundary condition can be imposed at the two interior
faces of the FEM mesh to bound the problem. This technique has been used successfully in [3]
and [4] and is extended to three dimensions in this paper, with considerations for applying
the phase boundary condition along the axis.

We also introduce an alternative, integral equation formulation that fully exploits the limited
mode phenomenon by making use of the modal dyadic Green’s function within the guide.
Preliminary results for this method will be shown and the major difficulty of extracting the
singularity from the modal Green’s function will be discussed.

2.0  Overlapping  Modal  and  Geometric  Symmetries 


Consider a unique slice of an engine-like termination as shown in Figure 2. Let $\phi_s$ be the
angular extent of the unique slice of the geometry. For any fan-like structure $\phi_s = 2\pi/N_s$,
where $N_s$ is the symmetry number (number of blades). Since the incident field will be a
cylindrical mode with an angular dependence of the form $e^{\pm j n_{in} \phi}$, the FEM system
resulting from a solution of the entire problem would take the form


(1) 


where $E^k$ is the unknown scattered electric field in slice k. Because the geometry is the same
in each slice, $K^1 = K^2 = \cdots = K^{N_s}$, and by linearity, the unknown scattered fields must
all be equal to within a phase factor. That is

$E^2 = E^1 e^{\pm j n_{in} \phi_s}, \quad E^3 = E^1 e^{\pm j 2 n_{in} \phi_s}, \quad \ldots, \quad E^{N_s} = E^1 e^{\pm j (N_s - 1) n_{in} \phi_s}$    (2)

and consequently, the scattered field is a periodic function in $\phi$ with period $\phi_s$ and a
progressive phase advance of $e^{\pm j n_{in} \phi_s}$ in each period (slice). This restricts the
possible scattered modes, and it can be shown that these scattered modes must satisfy

$n_{out} = n_{in} - m N_s$, m: any integer    (3)

From the above analysis it can be seen that coupling does not occur to modes having
$n_{out} = n_{in} \pm 1$, since a symmetry number of 1 ($N_s = 1$) is not a possibility. This fact
will turn out to be of great importance for establishing boundary conditions on the axis of the
FEM solution. Also, note that all of the possible scattered modes share a common phase shift
from symmetry face 1 to symmetry face 2. Thus, all scattered modes are related, on a cut of
constant z, from face 2 to face 1 by

$E^2_\rho = E^1_\rho e^{\pm j n_{in} \phi_s}, \quad E^2_\phi = E^1_\phi e^{\pm j n_{in} \phi_s}, \quad E^2_z = E^1_z e^{\pm j n_{in} \phi_s}$    (4)

where $E^2$ is the scattered field on face 2 and $E^1$ is the scattered field on face 1. This fact
was first exploited in [4] to efficiently compute eigenmodes within a cyclotron using FEM. For
the jet engine scattering problem at hand, a phase boundary condition can be used to restrict
the FEM computational region to a single slice of the original problem. The implementation of
the phase boundary condition for three-dimensional FEM analysis of the engine face is
discussed next.
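
As an illustration of the mode restriction in (3), the following sketch lists the azimuthal orders to which a given incident order can couple. The function name and the cut-off `n_max` are hypothetical conveniences introduced here; only the relation n_out = n_in - m N_s comes from the text.

```python
def allowed_scattered_orders(n_in, n_s, n_max):
    """Azimuthal orders n_out = n_in - m*n_s (m any integer) with |n_out| <= n_max.

    n_max is an assumed truncation, e.g. the highest order still propagating.
    """
    # n_out is allowed exactly when it is congruent to n_in modulo n_s
    return [n for n in range(-n_max, n_max + 1) if (n - n_in) % n_s == 0]


# A 40-blade engine face excited by an n_in = 1 mode
print(allowed_scattered_orders(n_in=1, n_s=40, n_max=100))
```

For 40 blades only every fortieth azimuthal order appears in the scattered field, which is why the modal scattering matrix is so sparsely populated.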


FIGURE  2.  Computational  domain  of  sliced  engine  section. 


3.0  Phase  Boundary  Conditions  for  3-D  FEM 

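The implementation of the phase boundary condition involves relating the degrees of freedom
on face 1 to face 2 by

$\hat{\rho}^1 \cdot (\hat{x} E^1_x + \hat{y} E^1_y) = \hat{\rho}^2 \cdot (\hat{x} E^2_x + \hat{y} E^2_y)\, e^{\mp j n_{in} \phi_s}$
$\hat{\phi}^1 \cdot (\hat{x} E^1_x + \hat{y} E^1_y) = \hat{\phi}^2 \cdot (\hat{x} E^2_x + \hat{y} E^2_y)\, e^{\mp j n_{in} \phi_s}$    (5)

After some algebra (5) becomes

$\begin{pmatrix} E^1_x \\ E^1_y \end{pmatrix} = e^{\mp j n_{in} \phi_s} \begin{pmatrix} \rho^1_x & \rho^1_y \\ \phi^1_x & \phi^1_y \end{pmatrix}^{-1} \begin{pmatrix} \rho^2_x & \rho^2_y \\ \phi^2_x & \phi^2_y \end{pmatrix} \begin{pmatrix} E^2_x \\ E^2_y \end{pmatrix}, \qquad E^1_z = e^{\mp j n_{in} \phi_s} E^2_z$    (6)

where $(\rho^1_x, \rho^1_y, \phi^1_x, \phi^1_y)$ and $(\rho^2_x, \rho^2_y, \phi^2_x, \phi^2_y)$ are the components of the polar unit vectors at
face 1 and 2, respectively. Expression (6) can be used directly to assemble degrees of freedom
on face 1 in favor of degrees of freedom on face 2.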
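
A minimal sketch of this elimination step is given below, assuming the phase boundary condition equates the polar components on the two faces up to a complex phase factor. The function names are hypothetical, and the phase factor is passed in explicitly so that no sign convention is hard-wired. Because the polar unit vectors on either face form an orthonormal pair, the 2 x 2 system reduces to a rotation through the angle between the faces.

```python
import cmath
import math


def polar_to_cartesian(e_rho, e_phi, phi):
    """Cartesian (Ex, Ey) of a transverse field given its polar components at angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return (c * e_rho - s * e_phi, s * e_rho + c * e_phi)


def face1_from_face2(e2_xyz, phi1, phi2, phase):
    """Express face-1 Cartesian components in terms of face-2 components.

    phase is the assumed complex factor relating the polar components:
    (E_rho, E_phi, E_z) on face 1 = phase * (E_rho, E_phi, E_z) on face 2.
    """
    ex2, ey2, ez2 = e2_xyz
    d = phi1 - phi2                      # rotation taking the face-2 basis to the face-1 basis
    c, s = math.cos(d), math.sin(d)
    ex1 = phase * (c * ex2 - s * ey2)
    ey1 = phase * (s * ex2 + c * ey2)
    return (ex1, ey1, phase * ez2)


# Check with a field whose polar components vary as exp(-j n phi) on a 4 degree slice
n, phi_s = 1, 2 * math.pi / 90
amp = (1 + 2j, 0.5, -1j)                 # arbitrary (E_rho, E_phi, E_z) amplitudes


def field(phi):
    ph = cmath.exp(-1j * n * phi)
    ex, ey = polar_to_cartesian(amp[0] * ph, amp[1] * ph, phi)
    return (ex, ey, amp[2] * ph)


recovered = face1_from_face2(field(phi_s), 0.0, phi_s, cmath.exp(1j * n * phi_s))
print(max(abs(a - b) for a, b in zip(recovered, field(0.0))))
```

In an FEM assembly the same relation would be applied edge by edge, so that every unknown on face 1 is replaced by a linear combination of unknowns on face 2.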


3.1  Boundary  conditions  along  axis. 

Since the hybrid FEM-Modal formulation as shown in Figure 1 makes use of an absorbing
layer to truncate the mesh, the space between the engine face and the absorber must include
the axis of the guide, where the phase boundary condition cannot be defined. In practice,
boundary conditions on the axis must be imposed differently for each modal excitation. First,
consider the behavior of the modes on the axis as shown in Table 1. Since it was previously
noted that mode coupling does not occur to modes having $n_{out} = n_{in} \pm 1$, there will never be
a mode from column 1 and a mode from column 2 which exist concurrently. If a mode included
in column 1 is present, i.e. $n_{in}$ modulo $N_s = 0$, then the boundary conditions to be enforced on
the axis are $E_x = E_y = 0$, which is consistent with all possible scattered modes. If a mode from
column 2 is present, i.e. $(n_{in} \pm 1)$ modulo $N_s = 0$, then the boundary condition $E_z = 0$ is
enforced. If all scattered modes are such that $|n_{out}| > 1$, then the conditions
$E_x = E_y = E_z = 0$ are imposed.
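
The case selection described above can be summarized in a short routine. This is a sketch only: the function name is hypothetical, and the transverse components are written as Ex and Ey on the assumption that Cartesian components are the ones constrained on the axis.

```python
def axis_boundary_condition(n_in, n_s):
    """Return the field components forced to zero on the axis for incident order n_in."""
    if n_in % n_s == 0:
        return ("Ex", "Ey")              # an n_out = 0 mode can be scattered
    if (n_in + 1) % n_s == 0 or (n_in - 1) % n_s == 0:
        return ("Ez",)                   # an |n_out| = 1 mode can be scattered
    return ("Ex", "Ey", "Ez")            # every scattered mode has |n_out| > 1


# 4 degree slice (N_s = 90): TM01-like (n = 0) and TE11-like (n = 1) excitations
print(axis_boundary_condition(0, 90), axis_boundary_condition(1, 90))
```

Because the first two cases cannot occur together for N_s > 1, the order in which they are tested does not matter.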

4.0  Example 

As an example, an inlet terminated in a short with radius of 0.66λ is analyzed by using only a
4 degree slice of the original problem. The absorber is placed 0.5λ from the short and the
connectivity boundary (where the scattering matrix is calculated) is located 0.25λ from the
short. The calculated scattered fields on the slice boundary for two modal excitations, as
depicted in Figure 3, are seen to have the correct behavior everywhere, including the axis.
When calculating the scattering matrix, this termination should simply generate a diagonal
matrix with each mode having a reflection coefficient of -1. The errors in our calculations
using a 4 degree slice are given in Table 2 for the first five modes. These errors are within
acceptable ranges for finite element implementations and are mainly attributable to the finite
reflections from the absorber used to terminate the mesh.


MODE        n = 0    n = 1    n > 1
TE, E_t     = 0      ≠ 0      = 0
TE, E_z     = 0      = 0      = 0
TM, E_t     = 0      ≠ 0      = 0
TM, E_z     ≠ 0      = 0      = 0

TABLE 1. Behavior of modes on axis


Mode       TM       TM       _        TE       TE
% error    -0.99%   -5.77%   -1.12%   +2.35%   -0.06%

TABLE 2. Errors in the calculated reflection coefficient for the first five modes using a 4 degree slice.


FIGURE 3. Calculated scattered field Re{E} for TM01 (left) and TE11 (right) excitation of a shorted
inlet using a four degree slice. Observe correct behavior of field along axis where phase boundary
condition is not defined.


5.0  Limited  Mode  Model 


The overlapping geometric and angular symmetry of the jet engine scattering problem can also
be exploited within an integral equation approach termed the Limited Mode Model (LMM).


At  the  heart  of  this  method  is  a  limited  eigenfunction  expansion  of  the  dyadic  Green’s  function 
for  the  interior  of  a  cylindrical  waveguide.  Under  single  mode  excitation,  the  dyadic  Green’s 
function  becomes  a  singly  (not  doubly)  infinite  series,  making  its  computation  highly  efficient. 

To begin, consider the general case of a metal obstacle within a cylindrical waveguide aligned
with the z axis. The unknown induced electric current $J(r')$ on the surface $\Omega$ of the obstacle
due to the excitation can be found by

$E^{i}(r) = -\int_{\Omega} \bar{G}(r/r') \cdot J(r')\, ds'$    (7)

The dyadic Green’s function for the cylindrical waveguide can be represented as an
eigenfunction (modal) expansion [5] which includes all waveguide modes traveling in either
direction away from the source, viz.


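[Equation (8): doubly infinite modal sum over n = -∞, ..., ∞ and p = 0, ..., ∞, together with a source term in δ(r - r'); the summand is not recoverable from the source.]    (8)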


The superscript + indicates that a mode is traveling in the +z direction ($E^{+} \propto e^{-j\beta z}$), and
the superscript - corresponds to a mode traveling in the -z direction ($E^{-} \propto e^{+j\beta z}$). The
upper signs are for the case z > z' and the lower signs are for the case z < z'.


The performance of (8) is known to be poor due to the extremely slow convergence of the
dyadic Green’s function. However, if we specialize the solution to obstacles with discrete
angular symmetry, the doubly infinite summation in (8) reduces to a singly infinite summation,
since only discrete values of n are present in the expansion of the scattered field as given by (3).


An  additional  simplification  to  the  computation  is  introduced  by  expanding  the  current  as 

N. 


i  =  1 


and  using  testing  functions  of  the  form 

=  /,■.,.(?)  =  (10) 

where  m  and  n  each  take  on  every  possible  limited  mode  index  independently. 

The most difficult aspect of the formulation is the singularity of the dyadic Green’s function.
Unlike the free space Green’s function, this singularity is expressed as a divergent series. There
are three techniques for handling the singularity of the dyadic Green’s function: move the
observation contour slightly, use a partial summation technique, or find an analytic function
expressible as cylindrical waveguide modes that can be subtracted from the divergent series,
which is then numerically integrated, and then added back in and analytically integrated. It
will be demonstrated that only the last technique will result in a viable method, although this
has not been accomplished to date.


It will be shown that the limited mode model requires far less computation than the FEM
implementation even though a dense system of equations results (see Figure 4), does not
require a phase boundary condition, and, most importantly, requires only a surface mesh, not a
volume mesh. These attractive points give some motive for finding an analytical extraction of
the dyadic Green’s function singularity.


Termination          FEM DOFs    LMM DOFs
SHORT (r = 0.66λ)    10,551      26
STUB (r = 0.66λ)     14,199      38

FIGURE 4. Domain of solution for LMM and comparison of numbers of degrees of freedom in FEM to LMM.


6.0  Conclusions 

Phase boundary conditions can be used to exploit the symmetry within an FEM solution of a jet
engine inlet cavity terminated by an angularly periodic structure (such as the front frame or a
compressor section) provided that a special set of boundary conditions is applied along the axis
where the phase boundary condition is not defined. The implementation of the phase boundary
condition within a three-dimensional FEM solution has been validated and the scheme can now
be used within a broader implementation with the goal of characterizing real engines. As a
consequence of the limited mode phenomenon, an alternative scheme based on an integral
equation, dyadic Green’s function approach was introduced. This scheme has the desirable
quality of requiring only a surface mesh (instead of a volume mesh) and requiring no phase
boundary conditions since the angular variation of the solution is built directly into the method.
However, to be complete, this method still requires the analytical extraction of the dyadic
Green’s function singularity.

Matrices occupy a central role in most physical modeling. As the size of matrices being solved increases,
it is becoming more important to quantify what factors influence the accuracy of the final result. The
condition number (CN) of a matrix is an important controlling factor in limiting the solution accuracy, the
effect of which can be circumvented by increasing the precision of the computations. This typically
involves going from single to double precision, e.g., going from 64-bit to 128-bit word size. But the
number of unknowns, the accuracy to which the original matrix coefficients are obtained, and the
accuracy to which the right-hand side is known also affect the final result.

This discussion reports results from some ongoing computer experiments that are being conducted with
the goal of acquiring some insight into such questions. The computations are performed using a
compiled BASIC language (Future Basic) that permits varying the computer precision up to 240 digits
(or more) through a simple configuration command. By also varying the matrix size, the accuracy to
which the original matrix coefficients are computed, and the matrix type and condition number, some
quantitative guidelines might be developed concerning the influences of these factors on solution
accuracy. One result that is obtained, not new in this study but consistent with previous
findings, is that with the solution accuracy, SA, the computation precision, P, and CN all expressed in
digits, their relationship can be expressed as P - CN < SA, where CN is one of the estimates commonly
used, such as the ratio of maximum-to-minimum singular values, and where CA ~ P is the coefficient
accuracy. Thus, even for a condition number of $10^{100}$ it is possible to obtain approximately X digits of
solution accuracy if P and CA are increased to ~100 + X digits. When the coefficient accuracy of the
original matrix is taken into account, possibly being less than P, the above result becomes, for the
matrices studied, CA - CN < SA < CA, i.e., coefficient inaccuracy can counter any benefit otherwise
derived from increasing the compute precision, and vice versa if CA is not commensurately increased.
The results obtained thus far will be reviewed and their implications for computational electromagnetics
(CEM) will be discussed.


INTRODUCTION 

Matrices arise in myriad ways in mathematics, the physical sciences, and many other areas of human
activity such as economics, one example of the latter being financial applications, where spreadsheet
applications have become ubiquitous. Matrices are found useful because they provide a way of
expressing multi-variable relationships, those where an outcome, or set of outputs, depends on a
weighted combination of inputs. In the most general sense, these relationships might be compared to the
state-transition matrices used in system analysis. Although that term is usually associated with
time-dependent problems, any problem where a convolution relationship occurs can be described in an
equivalent fashion. Such matrices comprise the connection between the state variables, which are the
independent variables for a given problem, and the system state, or state vector, which is the set of
outputs. In CEM problems, where the usual notation has [Z][I] = [V] and [I] = [Y][V], the input vector is
(usually) the exciting field, the output vector is the set of induced sources, and the state variables are the
spatial (and sometimes temporal) problem variables that determine the system matrix [Z] whose solution
(inverse) is [Y]. A CEM matrix represents, in a sampled, discretized, and approximated form, the
electromagnetic physics of the problem being modeled, and as such must contain all the information
needed for the subsequent solution if physically valid results are to be obtained.

It's worth noting that the CEM model could at best provide an exact solution to a rigorously described
physical problem but most often introduces physical and numerical modeling errors. Even if an exact (or
arbitrarily accurate) numerical solution were to be obtained to the CEM model, a physical modeling error
arises because that model is an approximate representation of the physical problem. Beyond that, an
exact numerical solution is rarely accessible, adding a further numerical modeling error to the computed
results. The “art” of numerical modeling involves reaching appropriate tradeoffs between the various
factors that affect these modeling errors, foremost of which are the number of samples or unknowns, N,
and the sampling dimensionality relative to the problem dimensionality.

In computational applications, the matrix is represented as a set of numerical coefficients, of which there
are $N^2$ for an Nth-order matrix. When most of these coefficients are zero, as is the case for
differential-equation (DE) models, the matrix is said to be sparse, whereas when all of the coefficients are
nonzero, as happens for integral-equation (IE) models, the matrix is described as dense, or full. In the
trivial case where the matrix has nonzeros only on the diagonal, interactions between the outputs are zero
and the matrix is solvable algebraically in at most N operations, as is the case for FDTD
(Finite-Difference Time Domain). Otherwise, a general solution independent of the input, or right-hand
side (RHS), generally requires an operation count (OC) that varies between $N^2$ and $N^3$ depending on
the matrix structure and coefficient count. RHS-dependent solutions can similarly be achieved with OCs
that vary from KN to $KN^2$, where K depends on a variety of factors including the matrix structure and
can also depend on N.

A common ingredient to contend with whenever seeking a solution of a matrix is how well-conditioned,
or conversely, how ill-conditioned that matrix may be. A well-conditioned matrix has, by convention, a
CN that is near unity, whereas a matrix whose condition number is $10^{100}$ would be regarded as
extremely ill-conditioned. The CN indicates how sensitive the output vector will be to errors in the input
vector or to errors in the matrix coefficients themselves. Other equivalent statements concerning matrix
conditioning are the degree to which any errors or uncertainties are amplified in the solution process, or
the degree to which information is lost between problem description and problem solution. A
conventional numerical statement relating the CN(Z) for matrix [Z] and solution errors is that

$\frac{\|\delta I\|}{\|I\|} \le \mathrm{CN}(Z)\, \frac{\|\delta V\|}{\|V\|}, \qquad \frac{\|\delta I\|}{\|I\|} \le \mathrm{CN}(Z)\, \frac{\|\delta Z\|}{\|Z\|}$

where $\|\cdot\|$ signifies some suitable norm. A larger CN implies that a more accurate problem description
will be needed, accompanied by higher computation precision, in order to maintain a specified solution
accuracy.
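
The first of these bounds can be illustrated on a small, nearly singular system. The sketch below is an assumed example, not one of the matrices studied here: it uses exact rational arithmetic so that the amplification of a right-hand-side perturbation is free of roundoff, and it takes the infinity norm as the "suitable norm".

```python
from fractions import Fraction as F


def inverse_2x2(z):
    """Exact inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = z
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def norm_inf_vec(v):
    return max(abs(x) for x in v)


def norm_inf_mat(m):
    return max(sum(abs(x) for x in row) for row in m)


Z = [[F(1), F(1)], [F(1), F(1) + F(1, 10000)]]   # two nearly parallel equations
Y = inverse_2x2(Z)
cn = norm_inf_mat(Z) * norm_inf_mat(Y)           # condition number in the infinity norm

V = [F(2), F(2) + F(1, 10000)]                   # unperturbed RHS, solution I = [1, 1]
dV = [F(0), F(1, 10000)]                         # small RHS perturbation
I = matvec(Y, V)
dI = matvec(Y, dV)

rel_in = norm_inf_vec(dV) / norm_inf_vec(V)
rel_out = norm_inf_vec(dI) / norm_inf_vec(I)
amplification = rel_out / rel_in
print(float(amplification), float(cn))
```

The perturbation, roughly five parts in 100,000 of the right-hand side, changes the solution by 100 percent, an amplification that stays below the condition number as the bound requires.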

Clearly, the coefficients that comprise a matrix represent a collection of information that is transformed
during the solution process to a different, but equivalent, collection of information to the extent that no
information is lost during the solution process. The process of inverting a matrix can at best preserve,
but not add, information to the problem. More often, information might be expected to be lost. In this
article we explore some ramifications of solving matrix-based problems in terms of how solution
accuracy might be related to the information content of the matrix being solved. In the following, we
describe the experimental methodology and the error measures employed, summarize some matrices that
might provide useful test cases, and present some preliminary results.


EXPERIMENTAL  METHODOLOGY 

It is generally very difficult to deduce analytically the numerical behavior of arbitrary matrices. Instead,
aside from a few special cases, some of which are cited below, it is necessary to perform computer
“experiments” designed to elicit trends and patterns about the expected behavior of problems of interest.

The overall strategy used here is to “track” information flow from an original matrix to its solution.
Several experimental matrices have been extracted for this purpose from A Collection of Matrices for
Testing Computational Algorithms by Gregory and Karney (1969), which includes a collection of
matrices having CN’s whose solutions are analytically expressible and some of whose inverses are also
available in analytical form. Others were created specifically as candidates whose CN’s could be varied
parametrically. First we discuss various metrics of matrix solution accuracy.

Assessing  Solution  Accuracy 

There are numerous ways to quantitatively assess the accuracy of any numerical solution. When
the specific problem involves solving matrices, some of the possibilities include:

1) Comparing true, $[Y_t]$, and computed (approximate), $[Y_a]$, inverses. When the inverse of
a test matrix is known either analytically or can be computed to high accuracy compared with that
expected in a given experiment, then this “true” inverse can be used to determine the accuracy of
the computed inverse.

2) Comparing the product $[Z_t][Y_a]$ with $[I]$, the identity matrix. Since the product of the
original matrix and its inverse should equal the identity matrix, their difference provides a
measure of the inaccuracy in $[Y_a]$.

3) Comparing solutions for specified RHS’s. This test would involve comparing $[Y_t][R]$
with $[Y_a][R]$, where $[R]$ is a RHS test vector, and many different RHS’s could be used. For
the specific RHS’s given by successive unit vectors, the test would become 1) above.

4) Comparing $[Z_t]$ with $[Y_a]^{-1}$. Inverting $[Y_a]$ would produce $[Z_t]$ in the absence of error
and so provides a measure of the solution accuracy that results from the two inversion
operations.

5) Comparing the eigenvalue (EV) or singular value (SV) spectra of $[Z_t]$ and $[Z_a]$, where $[Z_a]$
is an approximate representation of $[Z_t]$ due to truncating the coefficients of the latter, or of $[Y_t]$
and $[Y_a]$, respectively.

For the results presented here, the SA and other metrics are presented as digits of agreement between a
reference and a test result on a per-coefficient basis. For example, in using test (2), we compute
$[D] = [Z_t][Y_a] - [I]$, where $[D]$ is a difference matrix, and then find
$SA = \sum d_{ij}/N^2 = -[\sum \log_{10}(|D_{ij}|)]/N^2$, $i,j = 1,\ldots,N$, for an N×N matrix, where, if $D_{ij} = 0$,
$d_{ij}$ is set to the compute precision P used for that computation. Using a digits measure,
furthermore, is logical from the perspective of information theory, as the “information content”
represented by a specific outcome can be expressed as a logarithmic measure of the set of all possible
outcomes in units of bits or digits. Thus, the quantitative results presented below for SA can be
interpreted as the accessible information in $Z_t$ that appears in a $Y_a$ derived from it.

Two SA metrics are employed here, one $SA_d = Y_t - Y_a$ (the “difference” metric) and the other
$SA_p = Z_t Y_a - I$ (the “product” metric). When SA is less than P or CA, the implication is that the
actual information content of $Z_t$ must be less than $N^2 P$ due to some degree of linear dependency
among the equations which comprise it, or equivalently that information is lost as a result of roundoff in
the solution process. It seems obvious to conclude that the information content of $Y_a$ not only cannot
exceed that of $Z_t$, but that the information finally obtained in $Y_a$ actually defines the accessible
information content of $Z_t$. Also observe that as equation linear dependence increases, the
information-carrying digits move further to the right in each coefficient, so that a given number of digits
in the coefficients in $Z_t$ then represents less actual information. Although there seems to be no obvious
reason for $SA_d$ to be much different from $SA_p$, the results obtained here show that substantial
differences can exist, with the latter generally exceeding the former.
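
A variable-precision experiment of this kind can be sketched with Python's standard `decimal` module standing in for the compiled BASIC used in this study. Everything below is an illustrative assumption rather than the authors' code: the function names, the choice of Gauss-Jordan elimination with partial pivoting, the 6th-order Hilbert matrix as the test case, and the infinity-norm CN estimate computed from an exact rational inverse.

```python
from decimal import Decimal, getcontext
from fractions import Fraction
import math


def hilbert(n):
    """Exact Hilbert matrix, z_ij = 1/(i + j - 1) for i, j = 1..n."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]


def invert(a, one):
    """Gauss-Jordan inverse with partial pivoting; 'one' is the unit of the entry type."""
    n = len(a)
    zero = one - one
    m = [list(row) + [one if i == j else zero for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]


def inf_norm(m):
    return max(sum(abs(x) for x in row) for row in m)


def product_metric(z, y, precision):
    """Mean digits of agreement between [Z][Y] and [I]; exact zeros count as P digits."""
    n = len(z)
    total = 0.0
    for i in range(n):
        for j in range(n):
            d = sum(z[i][k] * y[k][j] for k in range(n)) - (1 if i == j else 0)
            total += precision if d == 0 else -float(abs(d).log10())
    return total / n ** 2


def experiment(n=6, precision=20):
    """Return (CN in digits, product-metric SA in digits) at the given precision."""
    getcontext().prec = precision
    exact = hilbert(n)
    cn = inf_norm(exact) * inf_norm(invert(exact, Fraction(1)))
    z = [[Decimal(f.numerator) / Decimal(f.denominator) for f in row] for row in exact]
    y = invert(z, Decimal(1))            # computed inverse at P digits, with CA = P
    return math.log10(float(cn)), product_metric(z, y, precision)


for p in (20, 40):
    print(p, experiment(6, p))
```

Running the experiment at two precisions gives a direct check of whether SA tracks P once the CN, expressed in digits, is accounted for.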

Candidate  Test  Matrices 

A variety of matrices that might be used for the kinds of computer experiments reported here are
listed below. Also included are their condition numbers and inverses, if analytically known. Unless
otherwise stated, the coefficient indices are summed from 1 to N, where N denotes the matrix order. All
analytic results for CN’s and matrix inverses are from Gregory and Karney (1969).

-Hilbert matrix: $z_{i,j} = \frac{1}{i + j - 1}$.

This matrix has a CN of order $10^{1.5N}$ and so represents a challenging problem for larger N for testing
matrix-solution algorithms, but unfortunately the CN cannot be varied independently of N. The Hilbert
matrix has an analytic inverse given by

$y_{i,j} = \frac{(-1)^{i+j}\,(N+i-1)!\,(N+j-1)!}{(i+j-1)\,[(i-1)!\,(j-1)!]^2\,(N-i)!\,(N-j)!}$.
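
The closed-form inverse can be checked in exact rational arithmetic. The short sketch below (function names are illustrative) builds both matrices from their formulas and multiplies them.

```python
from fractions import Fraction
from math import factorial


def hilbert_entry(i, j):
    """z_ij = 1/(i + j - 1), with 1-based indices."""
    return Fraction(1, i + j - 1)


def hilbert_inverse_entry(i, j, n):
    """Closed-form entry y_ij of the inverse of the n-th order Hilbert matrix."""
    num = (-1) ** (i + j) * factorial(n + i - 1) * factorial(n + j - 1)
    den = ((i + j - 1) * (factorial(i - 1) * factorial(j - 1)) ** 2
           * factorial(n - i) * factorial(n - j))
    return Fraction(num, den)


n = 5
product = [[sum(hilbert_entry(i, k) * hilbert_inverse_entry(k, j, n)
                for k in range(1, n + 1))
            for j in range(1, n + 1)]
           for i in range(1, n + 1)]
print(product == [[1 if i == j else 0 for j in range(n)] for i in range(n)])
```

Because every entry of the inverse is an integer, this matrix is convenient as a reference $[Y_t]$ when measuring digits of agreement.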


-Lotkin matrix: $z_{1,j} = 1$, $j = 1,\ldots,N$; $z_{i,j} = \frac{1}{i + j - 1}$, $i = 2, 3, \ldots, N$, $j = 1,\ldots,N$.

The  Lotkin  matrix  CN  is  of  order  10^-'’Niog(N),  similar  to  that  of  the  Hilbert  matrix  to  which  its 
coefficients  are  identical  except  for  the  first  row.  Its  inverse  is  also  known  analytically  and  is  given  by 

^N-if  N+  i  -  1  VN  ' 


Yi.i  =(-n 


y,.^i  =  (-• 


,  i  =  1,2,. ..,N, 


N-fi  lyN-t  JY  1.2,...,N,  j  =  1,2,...,N-1. 


J+.l 


i+j 


-Matrix of random numbers:

$z_{i,j} = x_r$, for $0 < x_r < 1$ or $-1 < x_r < 1$, with $x_r$ uniformly distributed.

The CN of a random matrix evidently is not derivable analytically, but the geometric mean of the CN’s of
a large collection of real square matrices of normally distributed random numbers has been determined to
be of order 4.65N [Edelman (1989)].

-Matrix of random numbers with last NII equations nearly parallel:

$z_{i,j} = x_r$, $i = 1,2,\ldots,N-NII$; $z_{i,j} = 10^{-y} x_r + (1 - 10^{-y})\, z_{i-1,j}$, $i = N-NII+1,\ldots,N$.

This matrix is generated to have NII equations whose degree of linear dependence is determined by the
parameter y, and produces a singular-value CN ~ $10^y$, where the number of small singular values is
NII - 1. Varying NII and y provides a way to tailor the eigenvalue spectrum and also to vary the CN.
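
A minimal generator for this kind of test matrix is sketched below. The function name, the argument names, and the use of a seeded pseudo-random generator with the interval (0, 1) are assumptions made for illustration.

```python
import random


def nearly_parallel_matrix(n, n_par, y, seed=0):
    """N x N random matrix whose last n_par rows are nearly parallel to the row above.

    Rows 1..N-n_par are uniform random numbers in (0, 1); each later row is
    10**-y * x + (1 - 10**-y) * (previous row). Requires n_par < n.
    """
    rng = random.Random(seed)
    eps = 10.0 ** (-y)
    rows = []
    for i in range(n):
        x = [rng.random() for _ in range(n)]
        if i < n - n_par:
            rows.append(x)
        else:
            prev = rows[i - 1]
            rows.append([eps * xj + (1.0 - eps) * pj for xj, pj in zip(x, prev)])
    return rows


z = nearly_parallel_matrix(n=8, n_par=3, y=6)
# Largest coefficient difference between each dependent row and the row above it
print([max(abs(a - b) for a, b in zip(z[i], z[i - 1])) for i in range(5, 8)])
```

Each dependent row differs from its predecessor only in about the y-th decimal digit, which is where the information distinguishing the equations resides.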

-Unitary matrix of random numbers:

$z_{i,j} = 1. + x_r$, $i = 1,2,\ldots,N-NII$; $z_{i,j} = 1. + 10^{-y} x_r$, $i = N-NII+1,\ldots,N$.

We might anticipate that the CN of such a matrix would also be of order $10^y$, since information about
coefficient differences is approximately y digits to the right of the decimal. The eigenvalue spectrum of
this matrix can also be varied through the values chosen for NII and y and by letting y be a function of i.

--The  Rump  matrix: 

A  class  of  very  ill-conditioned  matrices,  having  CNs  varying  from  IfP^^  to  10^^'^,  and  exactly 
representable in floating-point arithmetic is described by Rump (1991). The example used here has a CN

-  10'^  in  a  6x6  matrix  and  is  given  by 

3257199-2^  6746489-2'  -8816797-2"  1247053-2'^  1350835 1-2'  -14061827-2^ 

1247053-2^  13508351-2^  -14061827-2'  3527199-2^  67464892'  -8816797-2" 

1  -2^^  0  0  0  0 

0  1  -2^^  0  0  0 

0  0  0  1  -2^''  0 

0  0  0  0  1 


-Diagonal, unitary matrix: $z_{i,j} = 1.$ for $i \ne j$, $z_{i,i} = 1. + 10^{-y}$.

Again, this matrix could be expected to have a condition number of order $10^y$, and its eigenvalue
spectrum can be varied by letting y be a function of i.

-Binomial, lower-triangular matrix: $z_{i,j} = (-1)^{j-1} \binom{i-1}{j-1}$ for $j \le i$, $z_{i,j} = 0$ otherwise.

This matrix has a CN ~ exp(4N log 2) and its inverse is equal to the original matrix.
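
The self-inverse property is easy to confirm in exact integer arithmetic. The sketch below assumes the signed binomial-coefficient form of the lower-triangular matrix, with the diagonal included, and uses 0-based indices internally.

```python
from math import comb


def binomial_matrix(n):
    """Lower-triangular matrix with entries (-1)**(j-1) * C(i-1, j-1) for j <= i (1-based)."""
    # With 0-based i, j the same entry is (-1)**j * C(i, j)
    return [[(-1) ** j * comb(i, j) if j <= i else 0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


z = binomial_matrix(6)
identity = [[1 if i == j else 0 for j in range(6)] for i in range(6)]
print(matmul(z, z) == identity)
```

Since the true inverse is known without any computation, the difference metric can be evaluated for this matrix at any order.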

--Circulant matrix of random numbers:

$C_{1,j} = X_j$, $C_{i,j} = C_{1,j'}$, where $j' = N - i + 1 + j$ ($-N$ if $j' > N$)
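
The index rule can be sketched as follows, assuming the first row is supplied as a list and that j' = N - i + 1 + j is wrapped back into 1..N by subtracting N; `circulant` is a hypothetical helper name.

```python
def circulant(first_row):
    """Build C with C[1][j] = x[j] and C[i][j] = C[1][j'], j' = N - i + 1 + j (minus N if > N)."""
    n = len(first_row)
    rows = []
    for i in range(1, n + 1):
        row = []
        for j in range(1, n + 1):
            jp = n - i + 1 + j
            if jp > n:
                jp -= n          # wrap the column index back into 1..N
            row.append(first_row[jp - 1])
        rows.append(row)
    return rows


C = circulant([1, 2, 3])
```

Each row is the previous row shifted one place to the right, with the last element wrapping around to the front.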

--Toeplitz matrix of random numbers:

$T_{1,j} = T_{i,1} = X_p$, $T_{i,j} = T_{i-1,j-1}$ for $i, j > 1$


Condition-Number  Estimates 

A variety of CN measures have been employed to test matrix ill-conditioning. Gregory and
Karney (1969) discuss several, some of which are summarized below. Perhaps the most widely used
CN estimates employ the ratio of maximum to minimum eigenvalues (EVs), λ, or singular values (SVs), w,
of the given matrix, as expressed by

$$CN_{EV} = \frac{\max_i |\lambda_i|}{\min_i |\lambda_i|}, \qquad CN_{SV} = \frac{\max_i |w_i|}{\min_i |w_i|}.$$

Two other CNs discussed by Gregory and Karney are


-max  I  y; 
ij 


and  CN„ 


IIIZl 

where

It is observed by Gregory and Karney that CN^^, CN^.^ and do not differ much from

and, in particular [Taussky and Todd (1952) and Westlake (1968)], that


They  further  mention  that  for  random  matrix  elements  from  a  normal  population,  ~  (N|  *^^log(N) 

and  CNpn,j.jj^  ~  1N|  These  values  are  more  optimistic  than  the  result  obtained  by  Edelman  from  his 
computer experiments where N. Finally, they observe [see for example Golub and Van Loan (1983)]
that the estimate

$$CN_{K_{inf}} = \|[Z]\|\,\|[Y]\| = K_{inf}$$

known as the K_inf CN, is more often used, where the norms for [Z] and [Y] are not necessarily the same.
Here, we use [Golub and Van Loan (1983)]

$$K_{inf} = \max_i \sum_j |z_{i,j}| \;\cdot\; \max_i \sum_j |y_{i,j}|.$$
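
As an illustration of this estimate, the sketch below evaluates K_inf exactly for a small Hilbert matrix using rational arithmetic. It assumes the maximum-row-sum form of the infinity norm for both [Z] and its inverse [Y]; the functions `hilbert`, `invert`, `inf_norm` and `k_inf` are hypothetical helpers, not the routines used in the experiments.

```python
from fractions import Fraction


def hilbert(n):
    """Hilbert matrix with exact rational entries 1/(i + j + 1), 0-based i and j."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]


def invert(z):
    """Exact Gauss-Jordan inverse of a square matrix whose entries are Fractions."""
    n = len(z)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(z)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def inf_norm(m):
    """Maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in m)


def k_inf(z):
    return inf_norm(z) * inf_norm(invert(z))


k3 = k_inf(hilbert(3))   # exact value 748 for the 3x3 Hilbert matrix
```

Exact arithmetic is used here only so that the computed inverse is not itself contaminated by the ill-conditioning being measured.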

Among the vector norms that have been employed are the p-norm

$$\|x\|_p = \left(|x_1|^p + |x_2|^p + \cdots + |x_N|^p\right)^{1/p}$$

which for p = 1 or 2 is called the l1-norm and the Euclidean or l2-norm, respectively, and the
infinity-norm

$$\|x\|_\infty = \max_j |x_j|.$$
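
A short sketch of these vector norms, with illustrative function names, is given below; it also shows the p-norm approaching the infinity-norm as p grows.

```python
def p_norm(x, p):
    """(|x_1|^p + ... + |x_N|^p)^(1/p) for p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def infinity_norm(x):
    """Largest absolute component."""
    return max(abs(v) for v in x)


x = [3, -4]
l1 = p_norm(x, 1)          # sum of magnitudes
l2 = p_norm(x, 2)          # Euclidean length
l50 = p_norm(x, 50)        # already very close to the infinity-norm
linf = infinity_norm(x)
```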

Two other examples of matrix norms are given by

$$\|[Z]\| = \max_{x \neq 0} \frac{\|[Z]x\|}{\|x\|} \qquad \text{and} \qquad \|[Z]\| = \max_{\|x\| = 1} \|[Z]x\|.$$


Of the CNs above, CN_SV and K_inf have been used in our experiments. A possibly more
relevant a posteriori estimate of the effective CN of a matrix might be provided by determining the
accuracy actually achieved in solving the matrix as compared with the “true” result, as mentioned above.
The difference between the computed and true results, from which SA is found, can also directly provide
an estimate for CN by simply determining the difference between P and SA. This approach has also
been employed in the following computations, with the result denoted as CN_eff. Note that all results
presented in units of digits are on a per-coefficient basis. For example, in comparing two values for a
matrix, Y_a and Y_t, the total digits of agreement between their N^2 coefficients is divided by N^2 to obtain a
normalized per-coefficient result. The SVInt was evaluated by assuming a “precision floor” exists P
digits below the largest SV and summing the number of digits contributed by each SV relative to this
floor.
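
One possible reading of the SVInt description is sketched below. It rests on two assumptions not stated explicitly in the text: each SV contributes P minus its distance in digits below the largest SV (never less than zero), and the sum is divided by N to put it on the same per-coefficient basis as the other results. The name `sv_integral` is hypothetical.

```python
import math


def sv_integral(singular_values, p_digits):
    """Mean number of digits each singular value retains above the precision floor.

    The floor is taken to lie p_digits below the largest singular value.
    """
    w_max = max(singular_values)
    total = 0.0
    for w in singular_values:
        drop = math.log10(w_max / w) if w > 0 else float("inf")
        total += max(0.0, p_digits - drop)
    return total / len(singular_values)


# Three singular values, one of them ten digits below the others, with P = 24:
svint = sv_integral([1.0, 1.0, 1e-10], 24)   # (24 + 24 + 14) / 3
```

Under this reading, a matrix with a few small SVs loses only a fraction of y digits on average, while a spectrum that decays steadily loses digits in proportion to its length.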
NUMERICAL RESULTS

Experiments were conducted with many of the matrices outlined above, but results are presented here for
just a few of these: the Hilbert matrix, the random matrix with N_|| nearly parallel equations, and the
Rump matrix.

The  Hilbert  Matrix  (HM) 

Because of the high condition number it yields for relatively few equations, the HM provides a
useful test case for studying conditioning effects. Computations using the HM were performed as a
function of P, N, and CA. Results for SA are presented in Fig. 1 as a function of CA with N a
parameter and with P = 64 digits. It can be seen that the SA for each metric declines linearly with
decreasing CA, but that the difference between them increases progressively as N gets larger. The
curves in Fig. 2 exhibit the product metric and integral of the singular-value spectrum (SVInt) for the
HM for CA = P as a function of N. The slope of SVInt is about one-half that of the SA product metric,
with the former decreasing as 1/2N and the latter as 1/N, so that approximately
SVInt ~ P - N/2 digits and SA_p ~ P - N = SVInt - N/2 digits.
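
The kind of experiment described here can be imitated on a small scale with exact rational arithmetic, so that the only error source is the rounding of the coefficients to CA digits. The sketch below is not the authors' code: the function names, the use of relative error for the difference metric, and the cap on counted digits are all assumptions made for illustration.

```python
import math
from decimal import Context, Decimal
from fractions import Fraction


def hilbert(n, ca_digits=None):
    """Hilbert matrix; entries are exact, or rounded to ca_digits significant digits."""
    if ca_digits is None:
        return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]
    ctx = Context(prec=ca_digits)
    return [[Fraction(ctx.divide(Decimal(1), Decimal(i + j + 1)))
             for j in range(n)] for i in range(n)]


def invert(z):
    """Exact Gauss-Jordan inverse of a square matrix of Fractions."""
    n = len(z)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(z)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mean_digits(errors, cap=60.0):
    """Average of -log10(error) over all coefficients, capped at `cap` digits."""
    vals = [cap if e == 0 else min(cap, -math.log10(float(e)))
            for row in errors for e in row]
    return sum(vals) / len(vals)


def solution_accuracy(n, ca_digits):
    """Return (difference metric, product metric) in digits per coefficient."""
    z_true = hilbert(n)
    y_true = invert(z_true)
    y_approx = invert(hilbert(n, ca_digits))
    diff = [[abs(a - t) / abs(t) for a, t in zip(ra, rt)]
            for ra, rt in zip(y_approx, y_true)]
    prod = [[abs(sum(z_true[i][k] * y_approx[k][j] for k in range(n)) - int(i == j))
             for j in range(n)] for i in range(n)]
    return mean_digits(diff), mean_digits(prod)


sa_d_12, sa_p_12 = solution_accuracy(5, 12)
sa_d_20, sa_p_20 = solution_accuracy(5, 20)
```

For this small case the solution accuracy falls below the coefficient accuracy and rises roughly digit for digit as CA is increased, which is the linear dependence on CA described above.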


Random Matrix (RM)

Similar computations were conducted for a matrix consisting of random numbers uniformly
distributed between -1 and +1, with N_|| and CA systematically varied and with P = 48 digits. Note that
for this problem, increasing N_|| plays the same role as increasing N for the HM in terms of increasing the
CN. Results are presented in Fig. 3 for the difference and product metrics as a function of CA for N_||
having the values 0, 6, 11, 16, 21, 26 and 30, and with y = 10. Results obtained here exhibit some
distinct differences from the HM. Although the SA decreases linearly in proportion to the CA, the two
metrics behave quite differently. For N_|| = 0, where the CN is near unity, both metrics are close to the
CA. However, when N_|| has a non-zero value, the difference metrics are nearly equal independent of the
specific value of N_|| and are given by SA_d ~ CA - CN. The product metrics, on the other hand, are given
by SA_p ~ CA - (N_||/N)CN. Results for SVInt and SA_p as a function of N_|| are shown in Fig. 4 for P =
CA = 24, where it can be seen that these metrics are nearly equal as N_|| is increased, varying with N
approximately as SA_p ~ P - (N_||/N)y ~ SVInt digits. Thus, on the basis of these two quite different
cases, we might tentatively conclude that SVInt and SA_p satisfy a relationship like SVInt - N/2 < SA_p <
SVInt, which encompasses a very broad range of values.

Condition Numbers and Singular Values

It is instructive to examine the behavior of some of the CNs mentioned above for the HM and
RM as the ill-conditioning of these matrices is increased, as included in Figs. 5 and 6. For both matrices
the K_inf CN is the largest, while CN_eff, as determined by P - SA_p, is the smallest, with CN_SV lying in
between. Most interesting is the fact that for the RM with N_|| = 2 and y = 10, which produces one
small singular value, CN_eff is little different from that for N_|| = 0, whereas K_inf and CN_SV are nearly
equal to their maximum values for this case. As N_|| is increased, CN_eff increases ~ (N_||/N)y.


Singular-Value  Spectra  for  the  Hilbert  and  Rump  Matrices 

If the SVs (or EVs) of a matrix are found to decrease monotonically out to some point and then to
stabilize at a nearly constant value, this result may be indicative of reaching a computation
precision floor. An example of this is shown in Fig. 7, where the SVs of a Hilbert matrix of N = 30 are
plotted with the CA a parameter which ranges from 8 to 40 digits and with P = 64 digits. The spectrum
for each value of CA becomes essentially constant at the x’th SV and beyond when the ratio of the
largest SV to the x’th SV (in digits) approximately equals CA.

A quite different result is shown in Fig. 8, where the SVs of the N = 6 Rump matrix and of two RMs,
also of N = 6, are plotted. The RMs each had N_|| = 2 with y = 44, chosen to approximate the behavior
of the Rump matrix whose CN is about 10'^. The first used normalized coefficients while the
coefficients of the second were multiplied by the magnitude of the largest coefficient of the Rump matrix
to equilibrate their SV spectra. Although the SV spectra for the Rump matrix and the second RM are
nearly identical, the behavior of their respective SA metrics was found to be much different. The RM
exhibits a dependence on CA like that already shown in Figs. 3 and 4, whereas the Rump matrix seems
much more sensitive to CA, in a fashion similar to the HM, as is shown in Fig. 9. Apparently matrix
structure or coefficient pattern, which is much more organized in the case of the HM and Rump matrix
than is the case for the RM, has an influence that is not revealed by CN or SV spectra alone.

Fig. 1. Solution accuracy for Hilbert matrices of size N as a function of coefficient accuracy. Solid
circles are for the difference metric and the open squares are for the product metric of SA.

Fig. 2. Integral of the SV spectrum and SA product metric as a function of size for the Hilbert
matrix for P = 64 digits. SA varies approximately as 1/N while SVInt varies as 1/2N.

Fig. 3. Solution accuracy as a function of coefficient accuracy for a matrix of random numbers
with the number of nearly parallel equations (first ten digits equal) a parameter.

Fig. 4. Integral of SV spectrum and SA product metric as a function of N_|| for a matrix of random
numbers. Both the SA and SVInt vary approximately as 1/N_||.

Solution  Accuracy  and  Singular-Value  Integrals 

The SVInt and SA were shown above to be related for the HM and RM, although in
significantly different ways: for the HM we found SA_p ~ SVInt - N/2 digits, while for the RM
we had SA_p ~ SVInt digits. It is informative to see how these SA metrics and SVInts depend on CA for
the HM and RM, some results for which are shown in Figs. 10 and 11 respectively. There we observe
that the close correlation between the SVInt and SA is maintained for the RM with N = 30 for N_|| = 0 and
N_|| = 30, where using y = 10 in the latter case produces a CN ~ 10^10. For the HM, on the other
hand, the SA and SVInt, while also showing a linear dependency on CA, are increasingly different from
each other as N increases, with the SVInt always being larger. Thus, the SVInt appears to provide a
meaningful measure for the expected SA of a RM over a wide variation of CNs, while for the HM the
SVInt differs from the SA by an amount dependent on N, hence dependent on the CN. In both cases, the
SA is less than the CA by an amount that defines the effective CN.


CONCLUDING  COMMENTS 

Perhaps the most significant result obtained in this study that is relevant to CEM applications is that
coefficient  accuracy,  CA,  must  increase  in  proportion  to  the  matrix  condition  number,  CN,  to  maintain  a 
desired  solution  accuracy,  SA,  all  expressed  in  digits.  If  the  CN  of  a  CEM  matrix  increases  as  some 
function  of  the  number  of  unknowns,  say  f(N),  then  this  means  that  CA  >  SA  +  CN  =  SA  +  f(N).  Since 
N  is  invariably  larger  for  differential-equation  (DE)  models  except  for  problems  that  involve 
inhomogeneous  media,  this  implies  that  CA  would  need  to  be  proportionately  larger  for  DE  models, 
everything  else  being  equal. 

Another result worth noting is that the effective CN, i.e., the difference between compute precision, P,
and SA, may turn out to be much less than the standard estimates for CN indicate. A related result is that the
product metric for SA, given by [Z_t][Y_a] - [I], generally exceeds that obtained from the
difference metric, [Y_a] - [Y_t], where the subscripts “a” and “t” denote the approximate and true values of
the original (or [Z]) and inverse (or [Y]) matrices. This result might be explained by noting that each
coefficient in the product metric [Z_t][Y_a] comes from N multiplies and additions, with the possibility that
errors in individual coefficients will cancel, while those in the difference metric [Y_a] - [Y_t] are
individually accounted for.

Finally, we note that an integral of the singular-value (SV) spectrum, SVInt, for a matrix whose
coefficients are random numbers, even one with two or more nearly parallel equations, yields a result
that is close to the achieved SA, when both results are expressed in digits. This is in contrast to what is
found for the Hilbert matrix, where the SVInt exceeds the SA by an amount apparently determined by the
matrix CN. Even when the CN and SV spectra are essentially identical, as was found for the random
matrix and Rump matrix, the dependence of their SA on the CA was found to be quite different.
Apparently matrix structure, which is much more ordered for the Hilbert and Rump matrices than for the
random matrix, also plays an important role in the SA achieved. Further study will be needed to test
these tentative conclusions for a wider variety of matrices, most importantly including actual CEM
examples.

Fig. 5. CNs for Hilbert matrix as a function of matrix size. CN_eff is consistently found to be
lowest while K_inf is highest, with all increasing with matrix size, as expected.

Fig. 6. CNs for random matrix as a function of N_||. While K_inf and CN_SV both become large
with N_|| = 2, CN_eff starts at a small value and increases monotonically with N_||.

Fig. 7. SV spectra for N = 30 Hilbert matrix with coefficient accuracy a parameter with P = 64
digits. As the spectrum falls to the floor defined by CA, it then stabilizes at a nearly constant value.

Fig. 8. SV spectra for the Rump matrix and for a random matrix having N_|| = 2 with y = 44 digits,
with N = 6 for both. The coefficients of the “amplified” RM are multiplied by the magnitude of
the Rump matrix largest coefficient to equalize their maximum SVs.

Fig. 9. SVInt and SA metrics for the N = 6 Rump and random matrices as a function of CA.
The upper curve displays the difference metric for the true and approximate original matrices
while the next shows the SVInt for both, with nearly identical results for each matrix. The
difference metrics for SA of both matrices are very similar but their product metrics are very
different.

Fig. 10. SA_p and SVInt as a function of CA for Hilbert matrix of variable N. Triangles, squares
and circles are for N = 10, 20 and 30 respectively, while dotted lines depict SVInt and solid lines
depict SA_p.

Fig. 11. SA_p and SVInt as a function of CA for an N = 30 random matrix of variable N_||. Solid
squares are for N_|| = 0 and solid circles are for N_|| = 30, while dotted lines show SVInt and solid
lines show SA_p.

Numerical Accuracy Issues in Finite Element Frequency Domain
Solutions of Radar Scattering Problems

John  D’Angelo 
C+AES,  Inc. 


Abstract 

This paper discusses issues regarding the numerical accuracy of finite element frequency domain
methods  for  scattering  problems.  These  issues  are  also  pertinent  to  other  partial  differential  equation 
(PDE)  methods  such  as  finite  difference  and  finite  volume.  The  areas  discussed  are  dispersion  error, 
the  order  of  the  approximation  function,  modeling  the  singular  behavior  of  the  field  at  sharp  corners  of 
perfect  electric  conductors  (PECs),  and  the  convergence  criteria  of  the  iterative  solvers. 

Introduction 

Partial differential equation (PDE) methods (finite element, finite difference, and finite volume) are
inherently suited for solving large, complex scattering problems. This characteristic is mainly due to
their local nature, i.e. unknowns in a PDE discretized space are only coupled to neighboring
unknowns. The converse is true for method of moment or boundary element techniques, which are
global methods, where each discretized unknown is coupled to all others. Local techniques result in
sparse systems of equations which are computationally efficient in terms of computer memory and
floating point operations. This is especially true for problems containing numerous bulk material
regions or if a material coating cannot be accurately modeled by a surface impedance approach.

However, along with the advantage of their local nature, PDE methods pose challenges in numerical accuracy.
Dispersion error is a primary concern. PDE methods do not have the advantage of a Green’s function,
as do global methods, to help maintain phase accuracy in the solution. The wave behavior with a PDE
technique is modeled by a volume discretization, and the phase and group velocity of the electromagnetic
field solution are numerically dependent on the mesh density, the direction of the wave through the mesh,
and the order of the approximation function used to model the field unknown. As shown in [1,2],
the dispersion error is also dependent on the size of the scatterer in wavelengths.
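
The dependence of dispersion error on mesh density and electrical size can be illustrated with a one-dimensional analog. The sketch below is not the two-dimensional formulation studied in this paper: it assumes uniform first order (linear) elements with a consistent mass matrix, for which the discrete wavenumber k' satisfies cos(k'h) = (1 - (kh)^2/3) / (1 + (kh)^2/6). The function names are illustrative.

```python
import math


def numerical_wavenumber_ratio(nodes_per_wavelength):
    """k_numerical / k for 1-D linear elements on a uniform mesh."""
    kh = 2.0 * math.pi / nodes_per_wavelength
    cos_kh_numerical = (1.0 - kh ** 2 / 3.0) / (1.0 + kh ** 2 / 6.0)
    return math.acos(cos_kh_numerical) / kh


def accumulated_phase_error(nodes_per_wavelength, wavelengths):
    """Phase error in radians after the wave crosses `wavelengths` wavelengths."""
    ratio = numerical_wavenumber_ratio(nodes_per_wavelength)
    return abs(1.0 - ratio) * 2.0 * math.pi * wavelengths


err_10_nodes_4_wl = accumulated_phase_error(10, 4)
err_20_nodes_4_wl = accumulated_phase_error(20, 4)
err_10_nodes_10_wl = accumulated_phase_error(10, 10)
```

In this simplified model the phase error per wavelength falls by roughly a factor of four when the mesh density is doubled, and the accumulated error grows in proportion to the distance travelled, which is why electrically larger problems need denser meshes for the same accuracy.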

Also discussed in this paper is the need for accurate modeling of the singular field behavior at sharp
metallic corners, and the level of convergence needed in the iterative solution of finite element
frequency domain methods for RCS calculation.

Analysis 

The following is a numerical study of the error in finite element solutions of RF scattering problems.
The first numerical experiment will use a two-dimensional field analysis of the scalar Helmholtz
equation

$$\nabla^2\phi + k^2\phi = 0 \qquad (1)$$

Here, the finite element method will be used to model a free-space section with a plane wave traveling
through it in the -X direction. The finite element solution will then be compared to the exact
representation of the plane wave. In (1), φ is either the total H field for TE polarization or the total E
field for TM polarization (H and E are both Z directed).

Equation (1) is discretized in finite element form using first, second and third order
approximation functions for comparison. The finite element solutions will be truncated by a second
order Bayliss-Turkel absorbing boundary condition (ABC) [3]. Equation (1) is discretized by first
using a weak Galerkin weighting method,

$$\int_V \psi\,(\nabla^2\phi + k^2\phi)\,dV = 0 \qquad (2)$$

where ψ is the arbitrary weighting function and V is the solution domain. Equation (2) is then
integrated by parts:

$$\int_V \left[\nabla\psi\cdot\nabla\phi - k^2\psi\phi\right]dV - \int_S \psi\,\frac{\partial\phi}{\partial n}\,dS = 0 \qquad (3)$$

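In (3), S is the exterior surface of the finite element solution domain where the Bayliss-Turkel ABC is
applied. Here, the forcing function of the plane wave is applied through the ABC. The Bayliss-Turkel
ABC can be described by an operator on the scattered field unknown, i.e.

$$\frac{\partial\phi_s}{\partial n} = A\,\phi_s \qquad (4)$$

In (4), φ_s is the scattered field, φ = φ_i + φ_s, and φ_i is the incident plane wave, i.e., a complex
exponential in x. Using (4) in (3) results in

$$\int_V \left[\nabla\psi\cdot\nabla\phi - k^2\psi\phi\right]dV - \int_S \left[\psi A\phi\right]dS = \int_S \left[\psi\,\frac{\partial\phi_i}{\partial n} - \psi A\phi_i\right]dS \qquad (5)$$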

To examine the effect of electrical problem size, the free-space region in this example will be examined
for two problem sizes. The first has a length of four wavelengths from one end to the other; the second
has a length of ten wavelengths. The model will first be meshed at a rate of approximately 10 nodes
per wavelength. To examine the effect of the order of the approximation function, first to third order
elements will be used. The element types consist of both triangular and quadrilateral shaped elements.
The triangular elements are of the Lagrangian type whereas the quadrilateral elements are of the Serendipity
type. The total number of nodes for each case will be kept nearly constant for the three approximation
orders.

For the four wavelength problem, the total number of nodes is approximately 1300.
Figures 1 (a-c) show the error for the three types of elements. The error is defined here by

$$\text{Error} = |\phi - \phi_{exact}| \qquad (6)$$

Table 1 shows the peak error for the three approximation functions for this case.

As shown in Table 1, the error decreases dramatically, by a factor of 6.23 from first order to second order
and by a factor of 13.4 from first to third order. This illustrates, as also described in [1,2], that dispersion
error is large for first order approximation functions, which should therefore be avoided. Second and third
order approximation functions prove to be better suited for wave equation problems. Table 2 shows the
maximum error for approximately 20 nodes per wavelength (twice the number of elements) for the
same problems. Table 2 shows that the error for the first to third order approximation functions behaves as
O(h^2), O(h^3), and O(h^4), respectively, as theory predicts.

Approximation Function Order     Maximum |φ - φ_exact|
first                            0.4439
second                           7.1240 x 10^-2
third                            3.3141 x 10^-2

Table 1

Maximum error versus approximation function order for the four wavelength
example, ~10 nodes per wavelength


Figure 1 (a)

Error for the four wavelength case using first order
approximation functions

Figure 1 (b)

Error for the four wavelength case using second
order approximation functions

Figure 1 (c)

Error for the four wavelength case using third
order approximation functions

Figures 1 (a-b) also show that the dispersion error is larger at the far end of the solution domain, with
respect to the incident field direction, and smaller at the near end. This occurs for the first order and
second order approximation functions; for the third order approximation function the error is evenly
distributed. This was also observed in [1].


Approximation Function Order     Maximum |φ - φ_exact|
first                            0.1167
second                           5.3223 x 10^-3
third                            2.3450 x 10^-3

Table 2

Maximum error versus approximation function order for the four
wavelength example with denser mesh, ~20 nodes per wavelength
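
The convergence rate implied by Tables 1 and 2 can be estimated directly from the tabulated peak errors. The following is a minimal sketch that assumes the mesh spacing h is exactly halved between the two tables; `observed_order` is an illustrative name.

```python
import math

# Peak errors copied from Table 1 (~10 nodes/wavelength) and Table 2 (~20 nodes/wavelength).
errors_10_nodes = {"first": 0.4439, "second": 7.1240e-2, "third": 3.3141e-2}
errors_20_nodes = {"first": 0.1167, "second": 5.3223e-3, "third": 2.3450e-3}


def observed_order(err_coarse, err_fine, refinement=2.0):
    """Exponent q in error ~ h^q estimated from two mesh densities."""
    return math.log(err_coarse / err_fine) / math.log(refinement)


orders = {name: observed_order(errors_10_nodes[name], errors_20_nodes[name])
          for name in errors_10_nodes}
```

With these values the two-point estimates come out at roughly 1.9, 3.7 and 3.8 for first, second and third order elements. A two-point estimate from peak errors is coarse, so these numbers should be read as indicative only.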

Another  aspect  is  the  error  versus  problem  size.  Table  3  shows  the  maximum  error  for  the  ten 
wavelength  case  with  a  discretization  rate  of  approximately  10  nodes  per  wavelength,  showing  a 
marked  increase. 


Approximation Function Order     Maximum |φ - φ_exact|
first                            1.144
second                           0.174
third                            3.494 x 10^-2

Table 3

Maximum error versus approximation function order for the ten wavelength example

In Figures 2 (a-b) the error is shown for another four wavelength case. Here the outer boundary and
mesh are rectangular. The ABC is of the Engquist-Majda type [4], and a discretization rate of 10 nodes per
wavelength is used with second order elements. Figure 2 (a) has an incident field in the -X direction (0
degrees) and Figure 2 (b) has the field incident at 45 degrees.


Figure 2 (a)
Error for the rectangular four wavelength case with 0 degrees incidence.

Figure 2 (b)
Error for the rectangular four wavelength case with 45 degrees incidence.

Here, the error at 45 degrees incidence is less than at zero degrees. This can be attributed to the
increase in the effective order of the approximation function. At 45 degrees for this regular mesh, the
fields traverse the elements along their diagonal. Along the diagonal, the approximation function is
biquadratic whereas along the horizontal axis it is quadratic.

Another aspect is the convergence criterion of the iterative solver. Iterative solvers are used here for
three-dimensional finite element problems because of their low demands on computer memory and
operation count when compared to direct solvers. The solver used here is of the Quasi-Minimal Residual
(QMR) type [5]. It is shown here that the convergence criterion for the iterative solver need not be
exceedingly strict to obtain accurate values of RCS. The convergence criterion for solving a matrix
equation, Ax = b, is based on the residual, which is usually defined by

$$\text{Residual} = \frac{\|Ax - b\|}{\|b\|} \qquad (7)$$

The norms in (7) are either the l2 or the l∞ norms. Here, a square PEC plate will be used as an
example. The plate has a dimension of five wavelengths per side, and a monostatic solution
for an elevation cut will be used. Figure 3 shows the RCS calculated with the l2 norm residual set at
0.01 and 0.001. Figure 3 shows the typical result that the residual need not be an exceedingly low value
to obtain correct RCS results. The relatively high residual of 0.01 produces nearly identical results
while typically requiring a third to half the number of iterations of the 0.001 criterion. Figure 4 shows
the comparison of the finite element method with a GTD technique [6].
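
The stopping test can be sketched on a toy system. In the example below a plain Jacobi iteration stands in for the QMR solver, which is not reproduced here; the 3x3 matrix, the right-hand side and the function names are all made up for illustration, and the residual is taken in the l2 norm.

```python
import math


def relative_residual(a, x, b):
    """l2-norm relative residual ||A x - b|| / ||b||."""
    r = [sum(aij * xj for aij, xj in zip(row, x)) - bi for row, bi in zip(a, b)]
    return math.sqrt(sum(v * v for v in r)) / math.sqrt(sum(v * v for v in b))


def jacobi(a, b, tol, max_iter=1000):
    """Jacobi iteration from a zero start, stopped once the relative residual <= tol."""
    n = len(b)
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        # Every component is updated from the previous iterate only.
        x = [(b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
             for i in range(n)]
        if relative_residual(a, x, b) <= tol:
            return x, iteration
    return x, max_iter


A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
b = [1.0, 2.0, 3.0]
x_loose, it_loose = jacobi(A, b, 0.01)
x_tight, it_tight = jacobi(A, b, 0.001)
```

The looser criterion stops after fewer iterations than the tighter one. Whether the looser solution is accurate enough depends on the quantity derived from it, which for RCS is the question Figure 3 addresses.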


Figure 3

Monostatic RCS comparison for square five wavelength PEC plate,
elevation cut. Comparing different convergence criteria.

Figure 4

Monostatic RCS comparison for square five wavelength PEC plate,
elevation cut. Comparing the finite element method with GTD.

Another aspect of RF scattering analysis is modeling the singular behavior of the electromagnetic field
at sharp PEC corners. It is shown here that for a node-based formulation, special handling of corners is
required. Node-based elements are used instead of “edge-based” elements because they require
much less computer resources and allow the solution of larger, more practical problem sizes [7,8].

However, although node-based elements are efficient, difficulties arise when sharp corners are present.
To improve the results for these cases, techniques to include edge-based functions at the corners with
node-based functions everywhere else were developed. Edge-based elements have the advantage in modeling
corners because they use the tangential component of the field, thereby avoiding the ambiguous normal
direction at a corner, and have more degrees of freedom to aid in modeling the rapidly changing field.

Figure 5 shows the monostatic RCS for a small PEC cube using a node-based formulation and a
combined node/edge method. Using the node/edge method improves the overall solution.

Conclusion 

Accurate modeling of scattering using finite element methods is a developing technology. Results show
that using higher order basis functions dramatically improves the accuracy. Also, as problem sizes
increase, denser meshes are required to maintain the same level of accuracy as for smaller problems.

Also shown is that the convergence criteria of the iterative solvers need not be set to exceedingly small
values to obtain accurate RCS results. L2 residuals of 0.01 were shown to give results comparable to
those for 0.001, thereby saving considerable computation.

Lastly, it was shown that special treatment of corner singularities is required when using a node-based
finite element formulation.

Figure 5

Monostatic scattering from a metallic cube,
length of cube = 2.46 cm, freq. = 10 GHz, HH polarization.
Comparison of measured data, node only method and combined node/edge elements.

Accuracy in Computation of Matrix Elements of
Singular Kernels *

Stephen  Wandzura 
Hughes  Research  Labs 


Abstract 

Often a principal limiting factor in the accuracy of moment method (MoM) solutions
of field integral equations is the technique of approximation used in the representation
of near interactions. “Near”, in this context, means interactions that are sensitive to
singularities (typically logarithms or inverse powers) in the kernel or Green function at
vanishing separation. The problem arises because high-order (e.g. Gaussian) numerical
quadrature techniques are generally known only for nonsingular integrands. “High-order”
means that one can obtain extra digits of precision at relatively modest cost; in
practice, for sufficiently smooth integrands, the quadrature error vanishes exponentially
in the sampling frequency. In the computation of near matrix elements, Gaussian
quadrature is often employed in a way that destroys its high-order behavior, a practice
often called “singularity subtraction”. It is somewhat amusing that the problem may
be attributed in part to a semantic difficulty: mathematically, a “singularity” is a
point where a function or any of its derivatives is not defined; however, many consider
only “infinities” to be singular. The trouble is that Gaussian quadrature loses its real
advantage when confronted with an integrand that is singular by the mathematical
definition; this is most easily seen by reference to the formula for the error in Gaussian
quadrature.

The first part of the presentation will exhibit, by simple numerical example, the
high-order behavior of Gaussian quadrature, and how it is destroyed by crude “singularity
subtraction”, even with functions that appear, visually, to be extraordinarily
smooth. I will then exhibit two (very different) methods by which high-order computations
with singular kernels can be accomplished. The first is the application of
quadrature rules designed specifically for the type of singularity possessed by the kernel.
For two dimensional scattering problems, such rules, accurate for functions of the form
$f(x) + g(x) \log x$, where $f$ and $g$ are nonsingular, have only recently been discovered.
In the case of three dimensions, an old technique known as the Duffy transformation
accomplishes the same purpose. This method has the advantage that it strictly computes
the matrix elements of the MoM with the specified basis functions, and can be directly
applied to geometries with edges and corners. The second high-order technique results
from a modification of the kernel itself, in a way that removes the singularity, but
leaves convolution of the kernel with sufficiently smooth functions unchanged. This
method is similar to, and was inspired by, the way in which ultraviolet divergences are
regulated in quantum field theory. It has the advantage that ordinary product-type
Gaussian quadrature (of somewhat high order) can be used. Finally, I observe that
the accurate computation of near interactions is not specific to a MoM formulation;
the same issues arise, and in fact admit the same solution, in the computation of the
“local corrections” necessary to apply the Nystrom method to problems with singular
kernels.

* This research was supported by the Advanced Research Projects Agency of the Department of Defense
and was monitored by the Air Force Office of Scientific Research under Contract No. F49620-91-C-0()fH.
The United States Government is authorized to reproduce and distribute reprints for governmental purposes
notwithstanding any copyright notation hereon.


1  Accuracy  of  Quadrature  for 
Singular  Functions 

1.1  The  Problem 

In the computation of Galerkin matrix elements of the integral kernel of, for example, the
electric field integral equation, some (“near”) elements involve integration over the singularity
of the Green function at vanishing spatial separation. For smooth 2d (cylindrical) surface
scatterers, the singular integration can be isolated to a one dimensional integration of the
form

$$ z \equiv \int_0^a dr\, g(r)\, Y_0(kr) , \tag{1} $$

where $Y_0$ is a Bessel function of the second kind and $g(r)$ is a regular function of the separation
$r$ between source and field points. (The second kind Bessel function is the imaginary part of a
Hankel function, the real part of which is regular.) The Bessel function can be decomposed [1]

$$ Y_0(x) = \frac{2}{\pi} \ln(x/2)\, J_0(x) + s(x) , \tag{2} $$

where $J_0$ is the Bessel function of the first kind and $s$ is a regular function. Since, for small
$x$, $J_0(x) = 1 + O(x^2)$, the function

$$ y_0(x) \equiv Y_0(x) - \frac{2}{\pi} \ln(x/2) \tag{3} $$

is finite as $x \to 0$. Furthermore, as can be seen in Figure 1, $y_0(x)$, in spite of being singular
in the sense that it is differentiable only once at $x = 0$, is quite smooth to the eye. Because
of this, it is common to split $z = \tilde{z} + \bar{z}$ and to compute

$$ \bar{z} \equiv \frac{2}{\pi} \int_0^a dr\, g(r) \ln(kr/2) \tag{4} $$

[or even the same expression with $g(r) \to g(0)$] analytically, with the remainder $\tilde{z}$ approximated
by numerical quadrature. This procedure is called singularity subtraction. Its disadvantage is that
it is an inherently low-order technique, in that high precision calculations are very expensive. As an
illustration of this, consider the computation of a very simple case

$$ z \equiv \int_0^1 dx\, Y_0(10x) \approx -0.363851 \tag{5} $$

by $N$-point Gaussian quadrature [1]. Defining the precision as

$$ p(N) \equiv -\log_{10} \left| \frac{z_N - z}{z} \right| , \tag{6} $$

where $z_N$ is the quadrature approximation, we see in Figure 2 that the precision increases
very slowly. (The largest value of $N = 16$ corresponds to the rule of thumb “ten points per
wavelength”.) The behavior would be even worse if a more rapidly varying factor (from a
curvilinear surface or a high-order basis function) were in the integrand of (5). Often heard
statements by experienced practitioners of the method of moments like “everybody knows
that one can’t get more than three or four digits from numerical quadrature” stem from this
kind of behavior.
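The loss of high-order behavior can be reproduced with a few lines of code. The following is a minimal sketch under stated assumptions, not the computation behind Figure 2: instead of the Bessel-function remainder it uses the model integrand $x^2 \ln x$ on $[0, 1]$, chosen because the leading non-smooth term left in $y_0$ after the logarithm is subtracted is of this form, and compares it with the smooth integrand $\cos(10x)$. The helper names (`gauss_legendre_01`, `digits`) are illustrative and do not come from the paper.

```python
import math

def gauss_legendre_01(n):
    """Return (nodes, weights) of the n-point Gauss-Legendre rule on [0, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess on [-1, 1]
        for _ in range(50):  # Newton iteration on the Legendre polynomial P_n
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)  # derivative P_n'(x)
            x -= p / dp
        nodes.append(0.5 * (x + 1.0))                    # map [-1, 1] -> [0, 1]
        weights.append(1.0 / ((1.0 - x * x) * dp * dp))  # includes the factor 1/2
    return nodes, weights

def digits(f, exact, n):
    """Digits of precision, p(N) = -log10 |(z_N - z) / z|, of the n-point rule."""
    xs, ws = gauss_legendre_01(n)
    z_n = sum(w * f(x) for x, w in zip(xs, ws))
    rel = abs((z_n - exact) / exact)
    return -math.log10(max(rel, 1e-16))  # cap at roughly double precision

def smooth(x):
    return math.cos(10.0 * x)        # exact integral over [0, 1]: sin(10)/10

def model_remainder(x):
    return x * x * math.log(x)       # exact integral over [0, 1]: -1/9

if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n,
              round(digits(smooth, math.sin(10.0) / 10.0, n), 1),
              round(digits(model_remainder, -1.0 / 9.0, n), 1))
```

For the smooth integrand the number of correct digits grows rapidly with N until it is limited by round-off, whereas for the model remainder it grows only slowly, even though the function looks smooth when plotted.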

1.2  A  Solution  for  2d  Surface  Scattering  Problems 

In searching for a high-order method of computation of quantities like $z$, it is helpful to
consider how one can define $N$-point Gaussian quadrature. One approximates an integral
by a weighted sum:

$$ \int dx\, f(x) \approx \sum_{n=1}^{N} w_n f(x_n) \tag{7} $$

such that the result is exact for the $2N$ functions $\{1, x, x^2, \ldots, x^{2N-1}\}$. The low accuracy
of Gaussian quadrature is a reflection of the fact that a sequence of polynomials converges
to $Y_0(kx)$ (multiplied by any regular function) very slowly. This way of thinking led me to
consider what would happen if one adjusted the weights $w_n$ and abscissae $x_n$ such that the
quadrature were exact for a different set of $2N$ functions, namely

$$ \{1, \ln x, x, x \ln x, \ldots, x^{N-1}, x^{N-1} \ln x\} . $$
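As a worked illustration (not taken from the paper), the simplest rule of this kind can be found by hand. Assume, purely for the example, that the integration interval is $[0, 1]$ and take $N = 1$, so that the single weight $w_1$ and abscissa $x_1$ must integrate the two functions $1$ and $\ln x$ exactly:

```latex
\begin{aligned}
w_1 &= \int_0^1 dx = 1 , \\
w_1 \ln x_1 &= \int_0^1 \ln x \, dx = -1
\quad\Longrightarrow\quad x_1 = e^{-1} \approx 0.368 .
\end{aligned}
```

The weight is positive and the abscissa lies inside the interval; for comparison, the one-point Gauss-Legendre rule on the same interval has its abscissa at $x = 1/2$. For larger $N$ the moment equations are nonlinear in the abscissae and have to be solved numerically.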

I found (with some difficulty, as the equations to be solved for the $x$'s and $w$'s are quite
ill conditioned) that the resulting rules are very well behaved; that is, all the weights are
positive and all the abscissae are within the integration interval. Generalizations of these
“linlog” quadrature rules are analyzed and tabulated in [2]. When $N$-point linlog rules are
applied to the $z$ defined above, excellent convergence is obtained, as illustrated in Figure 3.
At Hughes Research Laboratories, we have used these quadrature rules to implement a very
high-order 2d scattering code.

Figure 3: Digits of precision in $z$ computed by “linlog” quadrature.

1.3  Solutions  for  3d  Surface  Scattering  Problems 

It is not easy to generalize the method described above to three dimensional surface scattering
problems. An excellent and entertaining discussion of the difficulties is given by Lyness [3].
The basic trouble is that the rather obvious extension, construction of a quadrature rule that
is exact for $g(r)/r$ for polynomial $g$ where $r$ is, for example, the distance to a vertex of a
triangular patch, cannot be done in a way that is invariant to “stretching” of the patch. This
means that one would have to compute weights and abscissae for each patch, an expensive
procedure. (It is not altogether impractical, though; a good way to do it is discussed by
Strain [4].)

There are a couple of feasible approaches that can be described briefly, although they
can be clumsy in practice. Both involve changes of variables in the integral (for example)

$$ z \equiv \int_0^1 dx \int_0^x dy\, f(x, y) / r(x, y) , \tag{8} $$

where $f$ is a regular function. The first is to use a product rule on the integral as it stands,
with Gauss-Legendre quadrature for the inner ($dy$) integration and “linlog” rule for the outer
($dx$) integration. Easier, perhaps, is the Duffy transformation [3, 5]:

$$ z = \int_0^1 dx \int_0^1 dt\, \frac{x}{r(x, tx)}\, f(x, tx) , \tag{9} $$

with Gauss-Legendre quadrature used for both integrations; this works because the Jacobian
of the transformation exactly cancels the singularity.
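The effect of the Duffy transformation can be checked numerically. The sketch below is an illustration under stated assumptions rather than the paper's implementation: it integrates $f(x, y)/r$ over the unit square with $r = \sqrt{x^2 + y^2}$ (singular at the corner at the origin), splits the square into two triangles, and applies the substitution $y = tx$ to each. For comparison it also applies a plain product Gauss-Legendre rule to the untransformed integrand. The function names and the test functions $f$ are hypothetical.

```python
import math

def gauss_legendre_01(n):
    """Return (nodes, weights) of the n-point Gauss-Legendre rule on [0, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # initial guess on [-1, 1]
        for _ in range(50):  # Newton iteration on the Legendre polynomial P_n
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)  # derivative P_n'(x)
            x -= p / dp
        nodes.append(0.5 * (x + 1.0))
        weights.append(1.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def naive_product(f, n):
    """Product rule applied directly to f(x, y)/r on the unit square."""
    xs, ws = gauss_legendre_01(n)
    return sum(wx * wy * f(x, y) / math.hypot(x, y)
               for x, wx in zip(xs, ws) for y, wy in zip(xs, ws))

def duffy(f, n):
    """Same integral after the Duffy substitution on each of the two triangles."""
    xs, ws = gauss_legendre_01(n)
    total = 0.0
    for x, wx in zip(xs, ws):
        for t, wt in zip(xs, ws):
            jac_over_r = 1.0 / math.sqrt(1.0 + t * t)  # x / r(x, t x): no singularity left
            # triangle y < x uses (x, t x); triangle x < y uses (t x, x)
            total += wx * wt * jac_over_r * (f(x, t * x) + f(t * x, x))
    return total

if __name__ == "__main__":
    exact = 2.0 * math.asinh(1.0)  # integral of 1/r over the unit square
    for n in (2, 4, 8):
        print(n,
              abs(naive_product(lambda x, y: 1.0, n) - exact),
              abs(duffy(lambda x, y: 1.0, n) - exact))
```

With $f = 1$ the exact value is $2 \ln(1 + \sqrt{2})$, so the two errors can be compared directly: the transformed integrand is smooth and the Gauss-Legendre error falls off rapidly, while the untransformed product rule improves only slowly.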

2  Accuracy  in  Regulation  of 
Singular  Kernels 

An alternative to using special quadratures to compute convolutions with singular kernels is
the modification of the kernel itself to truly remove the singularity. “Truly” means here: to
render the kernel and all its derivatives finite. Short-distance regulation of kernels is a
time-honored technique in physics; it was the key to solving the problem of extracting predictions
from higher-order computations in quantum electrodynamics [6, 7]. (The regulation methods
cited were, however, in present terminology, “low-order”: the resultant kernels, although
finite for vanishing separations, were still singular.)

The question of how to preserve accuracy control while regulating the kernel is illuminated
by the observation that achieving high-order convergence by use of special quadratures still
requires that the functions that are convolved with the kernel be regular. If one were to
introduce singular basis functions, for example, to model sources near geometric singularities
of the scatterer, the techniques described in the previous section would need to be modified.
Thus the key is to require that the regulated kernel not only converge to the singular kernel
as the regulation is removed, but that it give identical results to the singular kernel when
convolved with a suitable class of smooth functions.

I will show how this can be done to the kernel $1/(4\pi r)$ for the Laplace Equation in three
dimensions for convolution with functions on smooth two dimensional manifolds (smooth
surfaces of 3d objects). The first step is to Fourier transform to momentum space, giving
$1/q^2$. We then regulate the short-distance (large $q$) behavior with a factor $\exp(-a^2 q^2/4)$.
Fourier transforming back to real space gives the regulation prescription

$$ \frac{1}{4\pi r} \to \frac{\operatorname{erf}(r/a)}{4\pi r} , \tag{10} $$

where the regulated kernel obviously is nonsingular (because erf is an odd function),
approaches the unregulated kernel (for fixed $r$) as $a \to 0$, and has only local corrections. The
last property means that the difference between the kernels vanishes as fast as a Gaussian for
$r \gg a$. Because of this, we need not concern ourselves with kernel modification for sufficiently
far interactions, allowing compatibility with the Fast Multipole Method [8].
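The three properties claimed for the low-order regulated kernel are easy to check numerically. The following minimal sketch uses only the error functions in the Python standard library; the function names and the sample values of $r$ and $a$ are illustrative assumptions.

```python
import math

def kernel(r):
    """Unregulated Laplace kernel 1/(4 pi r)."""
    return 1.0 / (4.0 * math.pi * r)

def regulated_kernel(r, a):
    """Low-order regulated kernel erf(r/a)/(4 pi r), including its r -> 0 limit."""
    if r == 0.0:
        # erf(x) ~ 2x/sqrt(pi) for small x, so the limit is 1/(2 pi^(3/2) a)
        return 1.0 / (2.0 * math.pi ** 1.5 * a)
    return math.erf(r / a) / (4.0 * math.pi * r)

def relative_difference(r, a):
    """(G - G_regulated)/G = erfc(r/a); erfc avoids cancellation for r >> a."""
    return math.erfc(r / a)

if __name__ == "__main__":
    a = 0.1
    print("value at r = 0:", regulated_kernel(0.0, a))
    for r in (0.1, 0.3, 0.6):
        print(r, regulated_kernel(r, a), kernel(r), relative_difference(r, a))
```

The regulated kernel is finite at $r = 0$, and a few multiples of $a$ away from the origin the relative difference between the two kernels is already below double-precision round-off, which is why far interactions need no modification.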

The regulation so far is low order, in the sense that results computed with the regulated
kernel will have regulation errors that are proportional to a low power of $a$. What we want
to do, then, is to add a smooth local function that renders the regulation high order. A
fairly obvious choice is thus

$$ G_R(r) \equiv \frac{\operatorname{erf}(r/a)}{4\pi r} + \cdots \tag{11} $$

where $P_R$ is an $R$th order polynomial with coefficients adjusted to enforce

$$ \int dr \cdots \left[ G_R(r) - G(r) \right] = 0 ; \quad m = 0, \ldots \tag{12} $$

The application of this method to the wave equation in an arbitrary number of dimensions
is straightforward. We have implemented and verified it for both 2d and 3d electromagnetic
scattering problems.


3 Independence of Problem from
Discretization Method

The question naturally arises whether the details that have to be treated in order to achieve
high-order discretization of integral equations with singular kernels are specific to the method
of moments (MoM). Can these issues be circumvented by use, say, of a Nystrom discretization?
The answer is no. The best way to apply the Nystrom method to such problems [4] is
to compute local corrections to quadrature rules that would be appropriate to nonsingular
kernels. In fact, one of the questions that led to the high-order regulation method above was
whether these local corrections could be transferred from the quadrature rules to the kernel.
Since this can be done, it may even provide a more efficient method of correction computation
(in the multidimensional case) than the adaptive quadrature rules used by Strain [4].


References 

[1] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Applied Mathematics
Series, National Bureau of Standards, Cambridge, 1972.

[2] J.-H. Ma, V. Rokhlin, and S. Wandzura, “Generalized Gaussian quadrature rules for systems
of arbitrary functions,” Technical Report YALEU/DCS/RR-990, Yale University,
Department of Computer Science, October 1993. To be published in SIAM Journal of
Numerical Analysis.

[3] J. N. Lyness, “On handling singularities in finite elements,” in T. O. Espelid and A. Genz,
editors, Numerical Integration -- Recent Developments, Software and Applications, pages
219-233, Kluwer Academic Publishers, Dordrecht, 1992.

[4] J. Strain, “Locally-corrected multidimensional quadrature rules for singular functions,”
Technical report, Lawrence Berkeley Laboratory, 1994. To be published in SIAM Journal
of Scientific Computing.

[5] M. G. Duffy, “Quadrature over a pyramid or cube of integrands with a singularity at a
vertex,” Journal of Numerical Analysis, 19:1260-1262, 1982.

[6] W. Pauli and F. Villars, “On the invariant regularization in relativistic quantum theory,”
Reviews of Modern Physics, 21:434-444, 1949.

[7] R. P. Feynman, “Space-time approach to quantum electrodynamics,” Physical Review,
76:769-789, 1949.

[8] R. Coifman, V. Rokhlin, and S. Wandzura, “The fast multipole method: A pedestrian
prescription,” IEEE Antennas and Propagation Society Magazine, 35(3):7-12, June 1993.




Accuracy  Estimation  and  High  Order  Methods^ 

Lisa  R.  Hamilton,  John  J.  Ottusch*,  Mark  A.  Stalzer, 

R.  Steven  Turley,  John  L.  Visher,  and  Stephen  M.  Wandzura 

Hughes  Research  Laboratories 
3011  Malibu  Canyon  Road,  Malibu,  CA  90265 


Abstract 

Accuracy estimates are a prerequisite for meaningful comparisons of calculated radar cross
sections (RCS), especially performance comparisons. RCS codes incorporating high order methods
have advantages over low order codes both for achieving accurate answers and for estimating
solution accuracy. In this paper, we describe high order convergence and some high order
methods used in RCS computations; we describe a straightforward method for estimating
calculated RCS accuracy; and we discuss the tradeoffs between solution accuracy and cost (in terms
of cpu time and memory) for high and low order RCS codes.

1. Introduction

In our experience at conferences dealing with radar scattering calculations, when the accuracy of a given RCS
calculation is described at all, it is described in an imprecise, qualitative way that can be paraphrased as
follows: “When my calculation and so-and-so’s calculation (or so-and-so’s experimental results) are plotted on
the same scale, they appear to be similar.” Similarities are nice indeed, but without quantification of the
estimated accuracies of the various results, whether calculational or experimental, there is no way to determine
which one should be given the most credence or what may be the underlying cause of observed differences.
In addition, when it comes time to compare the efficiency of different codes in computing a particular RCS, the
computed results must be of comparable accuracy for the comparison to have any meaning.

Differences between a calculated RCS and the actual RCS for a given body result from 1) deficiencies in the
abstract model of the scatterer and 2) deficiencies in the way the RCS of the abstract model is calculated. The
former category is modeling error. It includes such simplifications as modeling a surface as a perfect
conductor, treating thin body parts as though they actually had zero thickness, etc. The latter category is
solution error. When two codes agree on an abstract model, differences in their calculated RCS’s are a reflection
of the different ways they arrive at approximate numerical solutions to Maxwell’s equations for the boundary
conditions appropriate to that model of the scatterer.

The attitude of many workers in the field seems to be that there is no way to estimate the accuracy of a
computed RCS for a given model, except in a qualitative way by comparing it to other calculated RCS’s for the
same model. For RCS codes that use the method-of-moments approximation, which, in principle, can generate
solutions with arbitrary accuracy, this is not the case. In fact, there are straightforward ways to estimate the
accuracy of an RCS calculation even when the correct answer is unknown. One such method is described below.

Another reason that calculated RCS’s are infrequently accompanied by accuracy estimates is a widespread
underappreciation of the tradeoff between the cost of a calculation (in terms of CPU time and memory) and the
accuracy of the resultant solution. Accurate solutions require more computer resources than inaccurate ones.
This, coupled with the fact that the cost of a calculation increases as a moderate power of the size of the
problem, means that as problem sizes grow, the resource requirements for obtaining acceptably accurate results
using standard, low order RCS codes grow ever more quickly and can easily become exorbitant.

... distribute reprints for governmental purposes notwithstanding any copyright notation hereon.

Incorporation of high order methods in an RCS code can improve the situation dramatically. High order
numerical methods are distinguished by their ability to rapidly converge to the correct answer. A well-known
example is Gaussian quadrature, the name of a class of high order methods used for numerically evaluating
integrals. Numerical methods can be classified as high order only with respect to a certain class of problems. For
problems within this class they converge rapidly to the correct answer. The rate of convergence is determined
by the order of the method, $p$, and it increases as the order of the method increases. For problems outside this
class they converge to the correct answer, albeit at a slower rate. Rapid convergence to an incorrect answer
almost always results from an implementation error. Intelligently applying high order methods in a
method-of-moments RCS code can reduce the cost of calculating an acceptably accurate RCS to the extent that an
otherwise impractically large problem may be done on a supercomputer or even a workstation and in less time.

The  principal  advantage  of  high  order  methods  is  that  they  reduce  the  power  by  which  the  cost  grows  as  a 
function  of  accuracy  for  fixed  problem  size.  Consequently,  cost  is  a  less  sensitive  function  of  accuracy  for  high 
order  codes  than  it  is  for  low  order  codes.  Therefore,  when  comparing  the  performance  of  different  codes,  it  is 
even  more  important  to  have  a  good  estimate  of  the  accuracy  of  a  result  computed  by  a  low  order  code  than  it 
is  for  a  result  computed  by  a  high  order  code. 

The mathematical basis of these arguments is as follows. One can express the solution to Maxwell's equations
for EM scattering from a surface as an integral equation, whose solution can be computed numerically by
discretizing the surface into separate patches and expressing the surface current on each patch as a linear
combination of elemental currents. The number $N$ of independent elemental surface currents whose coefficients
must be determined is proportional to $(A / \Delta)$, where $A$ is the surface area of the scatterer and $\Delta$ is a
discretization scale. The asymptotic cost of computing these coefficients increases as $N^{\alpha}$. The exponent $\alpha$ is 3
for cpu time and 2 for memory for a direct / dense solver and is < 2.5 for cpu time and < 1.5 for memory for a
single stage fast multipole method [1] iterative solver. In general, the process of discretizing an integral equation
for purposes of numerical solution introduces errors in the computed result that scale as $\Delta^{p}$, where the exponent
$p$ is the order of the numerical method used to solve the discretized equation. From these relations one can
deduce that cost $\propto (1 / \text{error})^{\alpha/p}$; i.e. cost scales as (1 / error) to a certain power, but that power is
inversely proportional to the order of the method.
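The scaling relation can be made concrete with a short calculation. The sketch below simply evaluates cost $\propto (1/\text{error})^{\alpha/p}$; the function name and the sample method orders $p$ are illustrative assumptions, not values taken from the paper.

```python
def cost_growth_factor(alpha, p, error_reduction):
    """Factor by which the cost grows when the error is reduced by `error_reduction`.

    Follows from cost ~ N**alpha, N ~ A/Delta and error ~ Delta**p,
    which together give cost ~ (1/error)**(alpha/p).
    """
    return error_reduction ** (alpha / p)

if __name__ == "__main__":
    alpha = 3  # cpu-time exponent quoted for a direct / dense solver
    for p in (1, 2, 4, 8):  # hypothetical method orders
        print("order", p, "-> one more digit costs a factor of",
              cost_growth_factor(alpha, p, 10.0))
```

For a first-order method each additional digit of accuracy multiplies the cpu time of a direct solver by a thousand, whereas for higher orders the multiplier shrinks toward one.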

This doesn’t quite tell the whole story because we haven’t described the coefficient of proportionality, which
also depends on the order of the method. Calculations of impedance matrix elements using high order
methods are more complicated, so they generally require more code and more time than a low order calculation
does. The time requirement, in particular, generally grows as the order of the method increases. The effect
this has on the total cost of a large calculation is small, however, when compared to the dramatic effect high
order methods have on the rate at which the time and memory requirements grow with increasing accuracy.
Experience has shown that low order methods are more efficient for relatively small problems and low
accuracies, but for large problems or high accuracies, high order codes have a distinct, and ever-increasing
advantage.

Let us now return to the topic of accuracy estimation. As $\Delta$ gets smaller and smaller, numerical solutions to
Maxwell’s equations calculated in the manner described previously will converge to the correct answer.‡ To put
it another way, if a series of RCS computations were performed in which the discretization scale $\Delta$ decreased
geometrically, the differences between the calculated RCS’s and the exact RCS would decrease as a function
of $\Delta$ (for sufficiently small values of $\Delta$). Suppose the series starts with $\Delta_0$ and ends with $\Delta_n$. The most accurate
of the calculations would be the one corresponding to the finest discretization scale, $\Delta_n$. The accuracy of this
solution can be estimated by using it as a stand-in for the exact solution and observing how fast the other
solutions converge to it. If they converge monotonically, the error in the $\Delta_n$ solution can be conservatively
estimated to be of the order of the difference between the $\Delta_n$ and $\Delta_{n-1}$ solutions. The essential point here is that
one RCS calculation is insufficient to provide any information about accuracy; in general, a series of
calculations is required to corroborate convergence to a unique answer, and from the convergence rate to derive an
accuracy estimate.

‡ This holds true in the absence of errors implementing the method-of-moments algorithm and in the regime
where round-off errors are insignificant.
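The estimation procedure can be sketched in a few lines. The following is a minimal illustration, assuming a list of scalar results (for example the RCS at one angle) computed for geometrically decreasing discretization scales. The function name and the synthetic data are hypothetical. The observed-order calculation is a standard Richardson-style addition that is not part of the procedure described in the text; it assumes error $\propto \Delta^p$.

```python
import math

def estimate_accuracy(solutions, ratio):
    """Estimate the error of the finest solution in a refinement series.

    solutions: results for Delta_0 > Delta_1 > ... > Delta_n (at least three)
    ratio:     Delta_k / Delta_(k+1), the geometric refinement factor (> 1)
    Returns (error_estimate, monotonic, observed_order).
    """
    if len(solutions) < 3:
        raise ValueError("need at least three solutions")
    finest = solutions[-1]
    # distance of each coarser solution from the stand-in "exact" answer
    gaps = [abs(s - finest) for s in solutions[:-1]]
    monotonic = all(g1 > g2 for g1, g2 in zip(gaps, gaps[1:]))
    # conservative estimate: difference between the two finest solutions
    error_estimate = abs(solutions[-1] - solutions[-2])
    d1 = abs(solutions[-3] - solutions[-2])
    d2 = abs(solutions[-2] - solutions[-1])
    observed_order = None
    if d1 > 0.0 and d2 > 0.0:
        observed_order = math.log(d1 / d2) / math.log(ratio)
    return error_estimate, monotonic, observed_order

if __name__ == "__main__":
    # synthetic second-order data: exact value 1, error = Delta**2, Delta halved each time
    series = [1.0 + 0.5 ** (2 * k) for k in range(4)]
    print(estimate_accuracy(series, 2.0))
```

In the synthetic series the true error of the finest solution is smaller than the returned estimate, which is the sense in which the estimate is conservative. The estimate should be treated as meaningful only when the monotonic flag is true.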


2.  Discussion 

In the rest of this paper, we will discuss some example problems that demonstrate the differences between
high and low order codes, the tradeoffs between cost and accuracy in RCS computations, and how accuracy
estimation is affected by method order. The first example, involving vector EM scattering from a 1λ-radius
sphere, is a graphic demonstration of what we mean by high order convergence of RCS results. The second
example, involving scalar scattering from a 1λ-radius circle in 2D, compares how the order of one’s method(s)
affects the rate at which a computed RCS converges to the correct answer. The third example, involving
scalar scattering from a 2D object we call the “bat”, demonstrates that high order methods have the upper hand
where it really counts, namely on large problems in which the RCS has a large dynamic range as a function of
angle and the computation is of little use unless it is accurate where the cross section is low.

The costs of the various calculations will be measured in terms of the number of unknowns, which for our
purposes is equivalent to cpu memory since we used a dense / direct solver in all cases. We decided to focus on
this one cost measure for simplicity and brevity of presentation.

2.1  Sphere 

A good way to illustrate the difference between high and low order convergence is to compare how efficiently
an RCS code using high order methods and one using low order methods compute cross sections for the same
geometry. For the comparison here, we have chosen two method-of-moments codes, FastScat§ and CARLOS-3D
(date of release, Sept. 1992). CARLOS-3D [2] is the EMCC-distributed standard for computing vector EM
scattering in 3 dimensions. It can be classified as a low order code because it is limited to using standard
Rao-Wilton-Glisson (RWG) [3] current basis functions (CBF) and a surface representation consisting of flat patches.
Its quadratures employ a mixture of high and low order methods.‖ FastScat is an RCS code being developed
at Hughes Research Laboratories that employs high order methods¶ in its current basis functions [4]
(generalized RWG CBF’s), quadratures [5], and geometry description [6] (exact surfaces).

We computed the monostatic RCS of a 1λ-radius sphere with both FastScat and CARLOS-3D. In both cases,
the sphere was patched by approximately identical triangular patches, starting from an inscribed icosahedron.
In the FastScat case, each patch was mapped exactly to the surface of the sphere. The patches were not
further subdivided. Accuracy improvements were achieved by invoking progressively higher-order CBF’s. In the
CARLOS-3D case, only the lowest CBF order exists. To increase the accuracy of the computed RCS, we
increased the density of unknowns by subdividing the patches into progressively smaller (nearly) identical
triangular patches, each of whose vertices were on the surface of the sphere.

The cross section of a perfectly-conducting sphere can be determined to arbitrary precision by numerically
summing the Mie series. For a 1λ-radius sphere the monostatic θθ-pol RCS is 5.03176 dB**, independent of
observation angle. Calculated RCS’s do vary with angle, however, due to current discretization, quadrature
errors, and in the case of CARLOS-3D, inaccurate realization of the surface using flat patches. Figure 1
shows two plots of RCS vs. latitude on the sphere, one for FastScat and one for CARLOS-3D. In each plot the
calculated RCS is shown for about 275, 500, 750, and 1000 unknowns as indicated in the legend. In the
FastScat case these curves correspond to CBF orders of 5, 7, 9, and 10; in the CARLOS-3D case, they
correspond to subdividing the inscribed icosahedron into 180, 320, 500, and 720 nearly identical triangles. The least
accurate computations are about equally bad in the two cases. As the surface density of unknowns increases,
however, it is easy to see that the FastScat solutions converge to the correct answer more rapidly. This is the
essence of high order convergence.

§ FastScat is a trademark of the Hughes Aircraft Company.

‖ CARLOS-3D's quadratures are high order for non-touching, flat patches; for touching or same patch
quadratures, CARLOS-3D uses a singularity subtraction technique that is inherently low order.

¶ FastScat’s methods are high order for scatterers comprised of smooth surfaces, whether flat or curved.

** dB stands for dB-λ², an engineering unit of RCS measurement.

Figure 1: RCS vs. latitude on the sphere (two panels: FastScat and CARLOS-3D).
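The triangle and unknown counts quoted for the CARLOS-3D runs are consistent with simple counting, which the short sketch below makes explicit. It rests on two assumptions that are not stated in the text: that each of the 20 icosahedron faces is divided into $n^2$ smaller triangles, and that there is one RWG unknown per edge of the closed triangulated surface (so the number of edges is 3/2 times the number of triangles). The function name is illustrative.

```python
def subdivided_icosahedron_counts(n):
    """Triangles and edges when each of the 20 icosahedron faces is split into n*n triangles.

    On a closed triangulated surface every edge is shared by exactly two
    triangles, so edges = 3 * triangles / 2. With one RWG basis function
    per edge this is also the assumed number of unknowns.
    """
    triangles = 20 * n * n
    edges = 3 * triangles // 2
    return triangles, edges

if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, subdivided_icosahedron_counts(n))
```

Under these assumptions the subdivisions n = 3 to 6 give the quoted 180, 320, 500, and 720 triangles, with edge counts close to the "about 275, 500, 750, and 1000 unknowns" shown in the figure legend.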


2.2  Circle 

Next we consider the bistatic RCS for TM scattering from a perfectly-conducting 1λ-radius circle. It is a simple,
but useful candidate for demonstrating how different aspects of the computation as well as the order of their
numerical methods affect the rate of convergence to the correct answer. As with the sphere, there are Mie series
solutions for the cross section, so there can be no dispute about what the correct answer is. However, for our
purposes it is useful to characterize the differences between a given calculated solution and the correct
solution over a range of angles as a single number, and there is some dispute about the best way to do this. Our
standard error measures are maximum relative error, maximum error ÷ average RCS, and rms error.†† It must
be emphasized that for this problem the results are essentially independent of which error measure is used.
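The three error measures named above, as defined in the accompanying footnote, translate directly into code. The sketch below is an illustration only: it assumes the calculated and exact cross sections are supplied as equal-length lists of positive values in linear units sampled at the same angles, and the function name is hypothetical.

```python
import math

def error_measures(sigma, sigma_exact):
    """Return (max relative error, max error / average RCS, rms relative error)."""
    rel = [s / e - 1.0 for s, e in zip(sigma, sigma_exact)]
    max_relative = max(abs(x) for x in rel)
    max_abs_error = max(abs(s - e) for s, e in zip(sigma, sigma_exact))
    average_exact = sum(sigma_exact) / len(sigma_exact)
    max_over_average = max_abs_error / average_exact
    rms = math.sqrt(sum(x * x for x in rel) / len(rel))
    return max_relative, max_over_average, rms

if __name__ == "__main__":
    calculated = [1.1, 2.0, 2.7]   # made-up sample values
    exact = [1.0, 2.0, 3.0]
    print(error_measures(calculated, exact))
```

The first and third measures weight every angle by its own exact cross section, so they are sensitive to errors in deep nulls, while the second normalizes by the average cross section and is dominated by the largest absolute error.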

As  we  stated  earlier,  FastScat  can  employ  high  order  methods  in  its  current  basis  functions,  quadratures,  and 
surface  representation.  The  user  can  control  the  order  of  the  method  used  for  each  aspect  of  the  computation. 
In  general,  the  convergence  rate  of  the  calculation  as  a  whole  is  determined  by  the  lowest  method  order  of  the 
three.  It’s  analogous  to  the  strength  of  a  chain  being  determined  by  its  weakest  link.  To  show  how  the  method 


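† These error measures are defined as follows:

Maximum relative error = $\max_\theta |\sigma(\theta) / \sigma_{exact}(\theta) - 1|$

Maximum error ÷ average RCS = $\max_\theta |\sigma(\theta) - \sigma_{exact}(\theta)| \,/\, \mathrm{ave}_\theta(\sigma_{exact}(\theta))$

RMS error = $\mathrm{rms}_\theta(\sigma(\theta) / \sigma_{exact}(\theta) - 1)$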
[Figure 2. RCS convergence for a 1λ-radius circle. Horizontal axis: number of unknowns.]

order of an individual component affects the rate of convergence of the full solution, we need to set the method
order for each of the other two components high enough so that they don’t contribute any noticeable error.
Once this has been done, we can measure how quickly the solution converges as we vary Δ by making the
patches smaller. For example, to investigate the effect of current basis function (CBF) order (up to a high
precision such as $10^{-10}$), we could set the quadrature order high enough to guarantee that the integrals are
accurate to at least that precision and use an exact surface representation. Then, for a fixed CBF order, we would
compute the RCS for a series of 1λ-radius circles divided into 4, 8, 16, 32, and so on, identical circular arcs. We have
plotted the errors for such a series of calculations in Figure 2 for CBF orders 0, 1, 2, and 3.‡ On the same
plot, we also show an example of how the surface model affects the convergence rate. The dashed curve
connects points that were computed by replacing the circular arc patches with flat patches.§ The order of the
quadratures was the same as in the previous case. For this case, however, only one CBF order is shown,
namely zero. The reason is that the poor surface representation so limits the rate of convergence that increasing
the CBF order has no noticeable effect on the accuracy of the solution. Curves for higher CBF orders are
virtual copies of the zeroth-order CBF result, shifted to higher numbers of unknowns.

The most important feature to note is that, for enough unknowns, the data fit a linear trend line whose slope
increases as the CBF order increases. Since the discretization scale, Δ, is inversely proportional to the number
of unknowns, N, this simply reflects the fact that the error diminishes as $\Delta^x$, where x increases with method
order. And, since memory is proportional to N², it also shows how method order affects the relationship between
accuracy  and  cost.  For  errors  less  than  about  10  ^  not  only  are  the  errors  in  the  RCS's  calculated  by  higher 
order  methods  lower,  but  also  the  marginal  cost  of  additional  accuracy  is  lower.  This  makes  assessment  of  the 
resource  requirements  corresponding  to  a  given  solution  accuracy  more  reliable,  which  is  the  performance 
comparison  benefit  mentioned  earlier. 
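
The slope argument above can be made concrete with a short sketch. It assumes only what the text states: error proportional to Δ^x, Δ proportional to 1/N, and memory proportional to N². The function names and the synthetic data are hypothetical and are not taken from Figure 2.

```python
import math

def estimate_order(unknowns, errors):
    """Least-squares slope of log(error) versus log(N), returned as a positive order x."""
    xs = [math.log(n) for n in unknowns]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return -numerator / denominator

def memory_factor_per_digit(order):
    """Memory growth for one more digit, assuming error ~ N**(-order) and memory ~ N**2."""
    return 10.0 ** (2.0 / order)

# Synthetic data obeying error = N**(-3), for illustration only.
n_values = [4, 8, 16, 32]
err_values = [n ** -3.0 for n in n_values]
print(estimate_order(n_values, err_values))
print(memory_factor_per_digit(1.0), memory_factor_per_digit(4.0))
```

Under these assumptions, a first-order method needs 100 times the memory for each additional digit, while a fourth-order method needs only about 3.2 times as much, which is the marginal cost advantage described in the text.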

Not many practical problems require that the RCS be calculated to anywhere near a precision of 1 part in $10^{10}$.
For example, if 1 part in 10 were deemed a “practical” accuracy requirement, then one can see from Figure 2
that using flat patches and zeroth-order CBFs is about as efficient as using an exact surface and higher order
CBFs. So, do these results have any relevance to more practical problems? The answer is yes. A small
circle was chosen for reasons of expediency (particularly with regard to the low order calculations) and because it


‡ For TM scattering in 2D, the nth-order current basis function is essentially an nth-order Legendre polynomial.
§ This is how RAM-2D works, for example.
is a geometry for which the relationship between method order and convergence rate can be demonstrated so
clearly and easily.¶ Truly interesting practical problems are generally much bigger. When the problems get
bigger, the RCS as a function of angle often has a large dynamic range. It is in these situations that the
advantages of high order methods become evident even at “practical” accuracies. This leads us to the final example
problem.


2.3  Bat 

Consider  the  fictitious  2D  geometry  shown  in  Figure  3  which  we  call  the  “bat”.  It  is  composed  of  straight  faces 
connected  smoothly  by  circular  arcs  of  radius  R.  There  are  two  long  edges  of  length  L  and  six  short  edges, 
each  of  length  L  /  3,  at  right  angles  to  each  other.  All  surfaces  are  perfect  conductors.  It  is  interesting  from  a 
practical  point  of  view  because  it  has  three  high  RCS  specular  reflection  regions  (one  of  which  is  the  2D  analog 
of  a  corner  cube)  and  a  low  RCS  everywhere  else. 


Let us focus on the monostatic RCS for TM scattering from an R = 1λ, L = 300λ bat. We don’t have an exact
solution to this problem, so we followed the prescription described earlier to ascertain accuracy. A series of
FastScat computations was performed with increasingly fine discretizations in order to look for convergence to
a unique answer. For these calculations, we used an exact surface representation and quadratures good to at
least 8 digits of accuracy. The discretization scale size was changed by changing the CBF order while keeping
the size of the patches fixed at about one wavelength per patch. The result shown in Figure 4a used 4th-order
CBFs and required 6000 unknowns. It is estimated to have an accuracy comparable to the width of the plotted
line. There are narrow peaks at 45° and 135° as expected and a broader peak centered at 180° resulting from
the “corner square” effect. The oscillations evident in the cross section are the result of interference, not of
any solution error.

For comparison purposes, we also calculated the RCS of the bat in the standard low order way using zeroth-order
CBFs. By increasing the patching density to about five patches per wavelength we kept the number of
unknowns constant at 6000. This result is shown in Figure 4b. Whereas this calculation generally agrees with
the high order calculation at angles where the cross section is high, in the low cross section regions the
calculated RCS is clearly erroneous, the error exceeding 20 dB at some angles.
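
The equal-unknown comparison can be checked with a little arithmetic. The sketch below rests on two assumptions that are not stated in the paper: the perimeter is taken as 2L + 6(L/3) = 4L, neglecting the shortening due to the rounded corners, and a patch carrying CBFs of order 0 through p is taken to contribute p + 1 unknowns. The function name is hypothetical.

```python
def bat_unknowns(edge_length_wavelengths, patches_per_wavelength, cbf_order):
    """Approximate unknown count for the 2D bat under the assumptions stated above."""
    # Two long edges of length L plus six short edges of length L/3 give 4L in total.
    perimeter_wavelengths = 4 * edge_length_wavelengths
    patches = perimeter_wavelengths * patches_per_wavelength
    # Assumption: one unknown per polynomial order from 0 up to cbf_order on each patch.
    return patches * (cbf_order + 1)

high_order = bat_unknowns(300, patches_per_wavelength=1, cbf_order=4)
low_order = bat_unknowns(300, patches_per_wavelength=5, cbf_order=0)
print(high_order, low_order)
```

Both configurations come out at the 6000 unknowns quoted in the text, so the two calculations use the same matrix size and differ only in how the unknowns are spent.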


¶ Note: we have also observed similar behavior for scalar and vector EM scattering from 1λ-radius spheres.
[Figure: Bat RCS (TM polarization).]

This  problem  dramatically  illustrates  the  limitations  of  using  low  order  methods  for  RCS  computations.  For 
someone  interested  in  the  low  cross  section  regions  of  this  body,  the  results  of  the  low  order  calculation  would 
be  particularly  misleading.  If  one  were  interested  in  improving  (or  for  that  matter,  estimating)  the  accuracy  of 
such  a  calculation,  one  could  make  the  patches  smaller  still,  knowing  that  the  low  order  calculation  would 
eventually reach the same accuracy as the high-order calculation. However, the additional cost would be
significant.


3.  Summary 

Using high order methods to solve RCS problems pays significant dividends, particularly on large problems.
The benefits that accrue are 1) lower CPU time and memory required to achieve a given accuracy, 2) lower
marginal cost for achieving additional digits of accuracy (which is a big advantage for accuracy estimation),
and its corollary, 3) reduced sensitivity of the cost to any imprecision in the estimate of the solution accuracy
(which makes it easier to make meaningful performance comparisons).
ACCURACY ISSUES IN TIME-DOMAIN CEM USING
STRUCTURED/UNSTRUCTURED GRID FORMULATIONS

Rockwell Science Center


Abstract 

A Computational Electromagnetics (CEM) capability for solving scattering, radiation,
and other problems of interest such as microstrip circuit analysis, bioelectromagnetics, and
EMP/EMI has been developed by numerically solving the time-domain Maxwell’s equations
employing some of the algorithmic rigors of Computational Fluid Dynamics (CFD).
The differential conservation form of Maxwell’s equations is used for the structured grid
option and the integral conservation form for the unstructured grid formulation. Application
of the CEM code for computing the radar cross section (RCS) of a complete fighter
involving tens of millions of grid points on a massively parallel architecture is demonstrated
along with a number of other applications including the propagation of microwave energy
through a complete human body.

Introduction 

A structured-grid and unstructured-grid based finite-volume, time-domain
Maxwell’s equation solver has been developed, incorporating modeling techniques for general
radar absorbing materials. Using this work as a base, the goal of the CEM effort is
to define, implement, and evaluate rapid prototype signature prediction, addressing many
issues related to 1) physics of electromagnetics, 2) efficient and higher-order accurate
algorithms, 3) boundary condition procedures, 4) geometry and gridding (structured and
unstructured), 5) computer architecture (SIMD and MIMD), and 6) validation.

Some of the accuracy issues associated with the time-domain CEM formulation are:

1) Stability and accuracy of finite-volume schemes applied to the discretized Maxwell’s
equations using von Neumann analysis

2) Demonstration of accuracy for various 1-D problems such as wave propagation
through material interfaces and the Salisbury screen, 2-D problems such as the perfectly
conducting, clad, and dielectric circular cylinders (comparison with Mie series),
and 3-D problems such as the perfectly conducting and coated spheres (comparison
with Mie series)

3) Study of how many grid cells per wavelength are needed, the accuracy of the outer
boundary condition treatment, and convergence criteria, for CW and pulse cases

4) Resource requirements to achieve accurate (say, within 1 dB) solutions for general 3-D
problems


1)  Maxwell’s  Equations 

In order to apply conservation principles (for example, in fluid dynamics mass, momentum,
and energy are conserved), many of the governing equations representing appropriate
physical processes are written in conservation form. The general form of a differential
conservation equation can be written as

$$ Q_t + E_x + F_y + G_z = \text{Source} \tag{1} $$

where Q is the solution vector and E, F, and G are the fluxes in the x, y, and z coordinate
directions, respectively. The conservation form readily admits weak solutions such as shock
waves.

The integral form of the conservation laws can easily be derived from the differential
form by integrating Eq. (1) with respect to x, y, z over any conservation cell whose
volume is V, with S denoting the source term:

$$ \iiint_V \left( \frac{\partial Q}{\partial t} + \frac{\partial E}{\partial x} + \frac{\partial F}{\partial y} + \frac{\partial G}{\partial z} \right) dx\,dy\,dz = \iiint_V S\,dx\,dy\,dz = \bar{S}. \tag{2} $$


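This can be rewritten in vector notation as

$$ \frac{\partial}{\partial t} \iiint_V Q\,dx\,dy\,dz + \iiint_V \nabla \cdot \mathcal{F}\,dx\,dy\,dz = \bar{S}. \tag{3} $$

In the above,

$$ \mathcal{F} = E\,\hat{i} + F\,\hat{j} + G\,\hat{k}. \tag{4} $$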


Applying the Gauss divergence theorem, we can convert the volume integral into a surface
integral:

$$ V \frac{\partial \bar{Q}}{\partial t} + \oint_{\partial V} (\mathcal{F} \cdot \hat{n})\,d\sigma = \bar{S}. \tag{5} $$

In the above equation, the outward unit normal at any point of the boundary surface of a cell
is denoted by $\hat{n}$, and the cell average of the dependent variables is denoted by $\bar{Q}$:

$$ \bar{Q} = \frac{\iiint_V Q\,dV}{\iiint_V dV}. \tag{6} $$

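The integral form of the conservation laws given by Eq. (5) defines a system of equations
for the cell average values of the dependent variables.

Maxwell’s equations in their vector form are

$$ \frac{\partial B}{\partial t} = -\nabla \times \mathcal{E} \tag{7} $$

$$ \frac{\partial D}{\partial t} = \nabla \times \mathcal{H} - J \tag{8} $$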
The divergence conditions $\nabla \cdot D = \rho$ and $\nabla \cdot B = 0$ are derived directly from Maxwell’s
equations, where $\nabla \cdot J = -\partial \rho / \partial t$. The vector quantities $\mathcal{E} = (E_x, E_y, E_z)$ and
$\mathcal{H} = (H_x, H_y, H_z)$ are the electric and magnetic field intensities, $D = (D_x, D_y, D_z)$ is the
electric displacement, $B = (B_x, B_y, B_z)$ is the magnetic induction, $J = (J_x, J_y, J_z)$ is the current
density, and $\rho$ is the charge density. The subscripts x, y, z in the vector representation of
$\mathcal{E}$, $\mathcal{H}$, B, and D refer to components in the respective directions.

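Under the transformation of coordinates implied by

$$ \tau = t, \quad \xi = \xi(t, x, y, z), \quad \eta = \eta(t, x, y, z), \quad \zeta = \zeta(t, x, y, z), $$

Eqs. (7) and (8) can be rewritten as

$$ Q_\tau + E_\xi + F_\eta + G_\zeta = S \tag{9} $$

where $J = |\partial(\xi, \eta, \zeta) / \partial(x, y, z)|$ is the Jacobian of the transformation and, e.g.,
$\boldsymbol{\xi} = [\partial_x \xi, \partial_y \xi, \partial_z \xi]$. The quantities $\boldsymbol{\xi} \times \mathcal{H}$ and
$\boldsymbol{\xi} \times \mathcal{E}$ in Eq. (11) represent tangential magnetic and electric fields at a constant-ξ
surface. Thus, the fluxes E, F, and G are nothing but the tangential fields.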

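Maxwell’s equations can also be cast in integral form as

$$ \frac{\partial}{\partial t} \iiint_V \begin{pmatrix} B \\ D \end{pmatrix} dV + \oint_{\partial V} \begin{pmatrix} \hat{n} \times \mathcal{E} \\ -\hat{n} \times \mathcal{H} \end{pmatrix} dS = 0 \tag{11} $$

where the six components of $\mathcal{F} \cdot \hat{n}$ in Eq. (5) are $(\hat{n} \times \mathcal{E}, -\hat{n} \times \mathcal{H})$.

In general, the differential form, Eq. (10), will be employed for finite-volume schemes
using a structured grid arrangement, and the integral form, Eq. (11), will be used for
unstructured grid cell arrangements using finite-element-like finite-volume schemes.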

2) Finite-Volume Treatment

The major feature of the present discretization approach that distinguishes it from
other finite-volume and finite-difference procedures is that the electric and magnetic field
unknowns are co-located in both space and time, rather than being assigned to two
interpenetrating spatial grids and separated a half-step in time. These field unknowns are the
volume averages of E and H within each cell in the space-filling grid.

An algorithm that maintains second-order accuracy in both space and time can be
constructed as follows (advancing from time level m to m + 1):


Jdci 

/C  =  -^  /  =~  [  i,  (f,  X  {n  X  [(0)™  -  o:'"]  })  dS 

'  a  Jocy  Jdrr 

QT''-  (C)  =  (0)"'  +  ''"  +  (r  -  ^c)  •  K  for  f  in  cell  o 

(0);:'+’  =  (qy: f  •  f ds  . 

.Ida  ^  ' 

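Here we have written Maxwell’s equations symbolically as

$$ \frac{\partial Q}{\partial t} + \nabla \cdot \mathcal{F}(Q) = 0, \qquad Q = (B, D), $$

and the solution of the Riemann problem gives the interface flux.

For Maxwell’s equations, the different values of $\hat{n} \times \mathcal{E}$ and $\hat{n} \times \mathcal{H}$ on the two sides of
an interface mix together in characteristic combinations to form the numerical interface
fluxes $\hat{n} \times \mathcal{E}^*$ and $\hat{n} \times \mathcal{H}^*$. For two cells with different ε and μ, these fluxes take the form

$$ \hat{n} \times \mathcal{E}^* = \hat{n} \times \frac{\left[ (\epsilon c)^+ \mathcal{E}^+ + \hat{n} \times \mathcal{H}^+ \right] + \left[ (\epsilon c)^- \mathcal{E}^- - \hat{n} \times \mathcal{H}^- \right]}{(\epsilon c)^- + (\epsilon c)^+} $$

$$ \hat{n} \times \mathcal{H}^* = \hat{n} \times \frac{\left[ (\mu c)^+ \mathcal{H}^+ - \hat{n} \times \mathcal{E}^+ \right] + \left[ (\mu c)^- \mathcal{H}^- + \hat{n} \times \mathcal{E}^- \right]}{(\mu c)^- + (\mu c)^+} $$

where the normal points from the (−) cell into the (+) cell, and $c = 1/\sqrt{\epsilon \mu}$ is the speed of light
inside each cell.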
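
The characteristic mixing can be illustrated with a small sketch. It assumes the upwind form n × E* = n × {[(εc)⁺E⁺ + n × H⁺] + [(εc)⁻E⁻ − n × H⁻]} / [(εc)⁻ + (εc)⁺], with the analogous expression for H* using μc and the opposite signs on the cross-product terms, where εc = √(ε/μ) and μc = √(μ/ε). The function names, the tuple representation of vectors, and the normalized units are all hypothetical choices made for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def interface_fluxes(n, e_minus, h_minus, e_plus, h_plus,
                     eps_minus, mu_minus, eps_plus, mu_plus):
    """Return (n x E*, n x H*) for a face whose unit normal n points from (-) to (+)."""
    y_minus = math.sqrt(eps_minus / mu_minus)  # (eps c) on the (-) side
    y_plus = math.sqrt(eps_plus / mu_plus)     # (eps c) on the (+) side
    z_minus = math.sqrt(mu_minus / eps_minus)  # (mu c) on the (-) side
    z_plus = math.sqrt(mu_plus / eps_plus)     # (mu c) on the (+) side
    nxh_minus, nxh_plus = cross(n, h_minus), cross(n, h_plus)
    nxe_minus, nxe_plus = cross(n, e_minus), cross(n, e_plus)
    e_star = tuple(
        (y_plus * ep + nhp + y_minus * em - nhm) / (y_minus + y_plus)
        for ep, nhp, em, nhm in zip(e_plus, nxh_plus, e_minus, nxh_minus))
    h_star = tuple(
        (z_plus * hp - nep + z_minus * hm + nem) / (z_minus + z_plus)
        for hp, nep, hm, nem in zip(h_plus, nxe_plus, h_minus, nxe_minus))
    return cross(n, e_star), cross(n, h_star)

# A wave travelling in +x on the (-) side and a quiet (+) side, with eps = mu = 1.
normal = (1.0, 0.0, 0.0)
zero = (0.0, 0.0, 0.0)
print(interface_fluxes(normal, (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), zero, zero,
                       1.0, 1.0, 1.0, 1.0))
```

With these inputs the interface state equals the (−) state, which is the expected upwind behavior: a wave moving toward the (+) cell carries its own field values across the face, whereas a wave moving away from the face contributes nothing.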

In order to understand the stability and accuracy, we apply this scheme to the simple
scalar advection equation

$$ \frac{\partial u}{\partial t} + c \frac{\partial u}{\partial x} = 0. $$

Advancing the solution u from time level n to n + 1, the fully discrete operator takes the form

$$ u_j^{n+1} = u_j^n - \mathrm{CFL} \left( \frac{3u_j^n - 4u_{j-1}^n + u_{j-2}^n}{2} - \mathrm{CFL}\,\frac{u_j^n - 2u_{j-1}^n + u_{j-2}^n}{2} \right) $$

where $\mathrm{CFL} = c\,\Delta t / \Delta x$. This operator is second-order accurate in space and time, and can
operate in a stable fashion up to a CFL number of 2. It has perfect shift at CFL of one
and two. The spectral characteristics for the phase and amplitude variations are shown in
Figures 1a and 1b.
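
The stated properties (stability up to a CFL number of 2 and perfect shift at CFL of one and two) can be checked numerically. The sketch below assumes that the operator is the standard second-order upwind update, u_j ← u_j − CFL[(3u_j − 4u_{j−1} + u_{j−2})/2 − CFL(u_j − 2u_{j−1} + u_{j−2})/2], which is consistent with those properties. The function names, the periodic grid, and the sample data are hypothetical.

```python
import cmath
import math

def second_order_upwind_step(u, cfl):
    """Advance u_t + c u_x = 0 (c > 0) by one step on a periodic grid of at least 3 points."""
    new_u = []
    for j in range(len(u)):
        u0, u1, u2 = u[j], u[j - 1], u[j - 2]  # negative indices wrap around (periodic)
        first = (3.0 * u0 - 4.0 * u1 + u2) / 2.0
        second = (u0 - 2.0 * u1 + u2) / 2.0
        new_u.append(u0 - cfl * (first - cfl * second))
    return new_u

def amplification(cfl, theta):
    """Von Neumann amplification factor for the Fourier mode exp(i j theta)."""
    z = cmath.exp(-1j * theta)
    return 1.0 - cfl * ((3.0 - 4.0 * z + z * z) / 2.0
                        - cfl * (1.0 - 2.0 * z + z * z) / 2.0)

def max_gain(cfl, samples=720):
    """Largest |amplification| over a uniform sampling of theta in [0, 2 pi]."""
    return max(abs(amplification(cfl, 2.0 * math.pi * k / samples))
               for k in range(samples + 1))

profile = [0.0, 1.0, 3.0, 2.0, 5.0, 4.0]
print(second_order_upwind_step(profile, 1.0))  # profile shifted by one cell
print(second_order_upwind_step(profile, 2.0))  # profile shifted by two cells
print(max_gain(1.5), max_gain(2.1))
```

The modulus and argument of `amplification` are the quantities one would plot to obtain spectral amplitude and phase curves as a function of CFL number.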
3) Geometry/Gridding

Two gridding issues that need to be addressed in EM computations are: 1) the number
of grid points per wavelength needed to properly represent the fields in and around a
scatterer; and 2) how far the outer boundary should be placed from the scattering object
to adequately simulate the nonreflecting boundary condition. In general, the number of
points/wavelength is not determined by wavelength alone, and also involves the body
dimensions (characteristic body size with respect to wavelength). The outer boundary
location, theoretically, can be right on the body surface itself; however, the computational
implementation of nonreflecting boundary conditions requires the outer boundary at a few
(2 to 5) wavelengths away from the surface. Again, if one can construct higher order
accurate implementations of nonreflecting boundary conditions, the outer boundary can be
brought very close to the scattering surface. In general, the necessary grid resolution is
provided only around and near the body surface. Between the body and the outer boundary,
the mesh is allowed to stretch, resulting in very crude (3 to 5 points per wavelength)
meshes near the outer boundary regions.

The free space wavelength is reduced to smaller values inside a material (as ε and μ
become large, the speed of propagation, $c = 1/\sqrt{\epsilon \mu}$, goes down, causing the wavelength to
scale accordingly). Thus, the grid resolution must take into account material properties
to adequately resolve the fields inside material zones.

The number of grid points per wavelength required depends on the order of accuracy
of the numerical scheme. A second-order accurate scheme usually requires at least ten grid
points per local wavelength. One may be able to use a higher order scheme and minimize
the number of grid points. However, as the order of accuracy goes up, the scheme will also
require more computations per grid point, which may offset the execution savings with
fewer grid points.

The requirement that the fields are resolved accurately with proper grid resolution
makes CEM problems computationally intensive, requiring large scale supercomputing.
For example, to compute the radar cross section of a typical aircraft at 1 GHz, even if one
used 10 grid cells per wavelength, it will require tens of millions of grid points.
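
A rough estimate shows where a figure of tens of millions comes from. The sketch below is an illustration under loud assumptions: the grid is uniform (the text describes stretched grids), the 20 m × 15 m × 5 m box is a hypothetical computational domain rather than a dimension taken from the paper, and the function names are invented. The local wavelength is scaled by 1/√(ε_r μ_r), following the remark on material zones.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second, in free space

def local_wavelength(frequency_hz, eps_r=1.0, mu_r=1.0):
    """Wavelength inside a material with relative permittivity eps_r and permeability mu_r."""
    return SPEED_OF_LIGHT / (frequency_hz * math.sqrt(eps_r * mu_r))

def uniform_grid_points(box_m, frequency_hz, points_per_wavelength=10,
                        eps_r=1.0, mu_r=1.0):
    """Number of cells needed to fill a box at a fixed number of points per wavelength."""
    spacing = local_wavelength(frequency_hz, eps_r, mu_r) / points_per_wavelength
    return math.prod(math.ceil(side / spacing) for side in box_m)

# Hypothetical domain enclosing an aircraft, at 1 GHz and 10 points per wavelength.
points = uniform_grid_points((20.0, 15.0, 5.0), 1.0e9)
print(f"{points:.2e} grid points")
```

Because the count scales as the cube of frequency and as (ε_r μ_r)^(3/2) inside materials, doubling the frequency multiplies the estimate by eight.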

4)  Validation 

Once a CEM code is developed, the results must be validated against known exact
solutions and carefully tailored experimental data. There are many computational issues
such as grid resolution, location of the outer boundary, and accuracy of the boundary
condition procedures that can only be addressed through a careful study of many validation
cases. The Electromagnetic Code Consortium (EMCC) has a list of validation cases
comprising many target shapes specifically designed for validating codes.

The CEM code has been extensively tested for the following geometries.

1) Canonical objects such as spheres, cylinders, ogives, thin rods, cones, airfoils, and a
circular disc

2) Almond shaped target

3) Inlets of various shapes (square, circular, curved, ...) including the presence of an infinite
ground plane

4) Flat plates of various planforms

5) Double sphere

6) Complete wing geometries with layers

7) Finned projectiles and cone-cylinder combinations

8) Scattering from ship-like targets

9) Complete fighter targets

To demonstrate the accuracy of the present scheme, results are shown in Figure 2 for a
number of spheres having different ka values, including the case of a coated sphere and a
resistive shell cylinder. Using about 10 points per wavelength on the surface (an expanding
grid is used from the body to the outer boundary), results are shown to compare very well
with series solutions.
Figure 1a. Spectral amplitude variation as a function of CFL number.


Figure 1b. Spectral phase variation as a function of CFL number.
This paper will focus on the accuracy of the SWITCH code, which is a 3D curvilinear
hybrid  finite  element  -  method  of  moments  code.  An  explicit  scheme  to  estimate  accuracy 
will  be  presented,  and  this  method  will  then  be  applied  to  a  variety  of  objects  predicted  by 
SWITCH.  Specific  test  cases  will  include:  dielectric  and  coated  metal  spheres,  cavity 
backed  patch  antennas,  and  business  card  and  conesphere  EMCC  benchmark  targets. 
The  trade  off  between  accuracy  and  computer  resources  will  be  discussed  for  the  hybrid 
approach  used  in  SWITCH.  The  advantage  of  high  accuracy  with  fewer  unknowns 
through  higher  order  basis  functions  will  also  be  demonstrated. 

A  useful  measure  for  assessing  accuracy  is  the  Pcum  X  measure,  where  x  represents  a 
percentage  between  0  and  100.  This  quantity  is  the  RCS  in  dB  for  which  x%  of  the  data  in 
a  region  is  below  this  level.  The  region  of  data  considered  is  usually  an  angular  sector  of 
a  pattern,  but  could  be  applied  to  any  set  of  data.  The  Pcum  50  and  Pcum  10  levels  will 
be  used  to  compare  SWITCH  predictions  to  either  analytical  solutions  or  measured  data. 
The  use  of  the  Pcum  10  calculation  for  comparing  data  is  helpful  since  it  avoids  a  direct 
comparison  point  for  point  in  deep  pattern  nulls  where  two  patterns  may  be  significantly 
different.  For  each  data  comparison,  an  overlay  plot  of  the  two  patterns  will  also  be 
presented  along  with  the  Pcum  50  and  Pcum  10  comparisons.  The  overlay  is  also 
important  for  assessing  accuracy  because  a  visual  check  of  the  pattern  structure  may  spot 
anomalies  between  patterns  that  may  be  smoothed  out  in  the  Pcum  evaluation. 
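
The Pcum measure described above is a percentile of the RCS samples in an angular sector. A minimal sketch follows, assuming the samples are already in dB and that linear interpolation between sorted samples is acceptable. The interpolation rule, the function name, and the sample data are assumptions, since the abstract does not say how SWITCH results are post-processed.

```python
def pcum(levels_db, x):
    """RCS level in dB below which x percent of the samples fall (linear interpolation)."""
    if not 0 <= x <= 100:
        raise ValueError("x must lie between 0 and 100")
    data = sorted(levels_db)
    rank = x * (len(data) - 1) / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(data) - 1)
    fraction = rank - lower
    return data[lower] + fraction * (data[upper] - data[lower])

# Hypothetical sector of RCS samples in dB, for illustration only.
sector_db = [float(v) for v in range(-50, 51)]
print(pcum(sector_db, 50), pcum(sector_db, 10))
```

Because a percentile depends on the ordering of the samples rather than on their angular positions, a small angular offset between two otherwise similar patterns barely changes it, which is why it avoids the point-for-point comparison in deep nulls.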

The  trade  off  between  accuracy  and  computer  resources  will  be  discussed  for  the 
SWITCH  code.  The  first-order  curvilinear  roof-top  basis  functions  used  in  SWITCH  have 
been  shown  to  be  efficient  basis  functions,  giving  high  accuracy  with  fewer  unknowns 
than  faceted  basis  functions  resulting  in  reduced  computer  storage.  Examples  will  be 
shown  to  demonstrate  this  efficiency.  The  SWITCH  code  also  incorporates  higher  order 
basis  functions  which  result  in  even  fewer  unknowns  and  reduced  storage.  These  higher 
order  basis  functions  usually  result  in  longer  running  times  when  a  Galerkin  method  of 
moments  approach  is  used.  The  SWITCH  code  has  encoded  a  subdomain  testing 
procedure  for  the  higher  order  basis  functions  resulting  in  a  substantial  speed  up  of  run 
time  compared  to  the  Galerkin  approach.  Both  approaches  are  encoded  in  SWITCH  and 
the  differences  in  run  time  will  be  shown. 
Abstract

In  this  paper  four  MM-based  codes  are  compared  for  a  set  of  canonic  and  non-canonic  scattering 
problems  as  a  function  of  several  parameters.  These  include  the  discretization  density  and  uniformity 
required  to  achieve  a  convergent  answer  for  various  quantities  such  as  the  mono-  or  bi-static  far  fields, 
near  fields,  and  surface  fields  or  currents.  Only  the  electric  field  integral  equation  (EFIE)  formulation, 
restricted  to  perfectly  conducting  scatterers,  will  be  discussed. 

The  four  MM  codes  to  be  compared  are  the  CARLOS-3D  (vers.  3.0),  CARLOS-Q,  CARLOS-SW  and 
CLOAK  codes.  These  codes  were  developed  at  McDonnell  Douglas  and  are  based  on  the  Galerkin 
implementation  of  the  EFIE  formulation  using  different  surface  discretizations  such  as  flat  triangular 
facets,  curved  quadrilateral  patches,  and  combinations  of  the  two.  The  following  basis  functions  are 
considered:  roof  top,  surface  wave,  Hermite,  Chebyshev,  and  higher  order  polynomial  expansions. 
For  standardization  the  LINPACK  LU  decomposition  is  used  to  solve  the  resulting  system  of 
equations  for  all  formulations. 


1.0  Introduction 

The  method  of  moments  (MM)  technique  for  the  solution  of  complex  radiation  and  scattering 
problems  has  achieved  universal  acceptance  as  an  extremely  robust  solution  procedure  for  Maxwell’s 
equations.  Many  codes  have  been  developed  for  one-dimensional  (wire),  2-D  and  3-D  problems, 
including  specialized  versions  used  interactively  by  designers.  The  accuracy  of  these  codes  has 
reached  such  a  state  of  development  that  high  value  range  measurements  are  calibrated  with  MM 
derived  data. 

The  nature  of  the  MM  technique  precludes  proof  of  absolute  convergence  of  the  solutions  for  a  given 
problem.  Therefore,  from  the  very  beginning  the  MM  results  were  compared  to  classical  “exact” 
results such as the Mie solution for circular cylinders and spheres, experimental data and other
numerical  solutions  for  non-canonic  geometries  when  available.  Implicit  in  comparing  these  results 
was  the  assumption  that  the  surface  (geometry)  of  the  object  under  analysis  was  accurately  described 
mathematically  when  comparing  different  analysis  techniques.  Similarly,  it  was  assumed  that 
experimental  data  was  taken  with  sufficient  care  with  regards  to  calibration,  sting  interactions,  and 
fabrication  of  the  test  object.  Computational  efficiency,  loosely  defined  as  the  CPU  time  and  memory 
resources  required  to  achieve  a  given  level  of  accuracy,  was  generally  not  considered. 




In  this  paper  we  approach  the  issue  of  fidelity  in  a  different  manner.  First  we  compare  only  MM 
techniques  with  each  other.  We  restrict  the  discussion  to  the  Galerkin  form  and  assume  that  a  given 
LU  solver  algorithm  is  used.  The  mathematical  description  of  the  scatterers  is  generated  via  a 
Unigraphics  10.0  package  from  which  the  inputs  for  the  various  codes  are  generated.  All  cases  were 
run  on  an  HP  9000^50  workstation.  This  allows  consistent  across-the-board  comparisons  to  be 
made. 


2.0  Description  of  Codes 

As noted above, only the electric field integral equation (EFIE) implementation with the four codes is
considered. This restricts the comparisons to perfectly electrically conducting bodies (PEC). A
short description of the codes under consideration follows. The major differences are the surface
representations and the basis functions used. The EFIE formulation is identical in each case; namely,
it proceeds from an operator-based implementation that is invariant as to geometry or basis functions.
Implementation of the specific forms of these operators varies with the foregoing geometry and basis
function descriptions. This is outlined in some detail below.

2.1  CARLOS-3D  (vers.  3.0) 

CARLOS-3D is a general-purpose MM code for computing the scattering from complex 3-D objects.
The code implements Galerkin testing to solve the Stratton-Chu surface integral equations. Complex
geometries composed of multiple conducting and bulk dielectric regions can be modeled, although in
this paper we focus on PEC cases only. (Refs. 1, 2)

A major feature of the code is its modularity, built on the generalized Galerkin operators, which are
geometry and surface-representation independent, permitting the code to be extended to
advanced basis functions and parametric surfaces. These operators result from testing either the
integral operators in the Stratton-Chu equations, the equivalent currents, or the incident fields.
CARLOS-3D vers. 3.0 implements this with the Rao-Wilton-Glisson roof-top basis functions. Matrix
elements in the code are computed using a combination of analytic and numerical procedures, with the
algorithm being facet-based to eliminate redundancy. Self-terms are computed analytically. Near
terms are evaluated using singularity extraction with adjustable quadrature formulas, optimizing matrix
fill times and accuracy.

2.2  CARLOS-Q 

Due to the modular architecture of CARLOS-3D, it is straightforward to implement a variety of
different surface representations and basis functions. A quadrilateral patch (Q-patch) formulation was
implemented within the CARLOS-3D framework (Ref. 3). A parametric surface geometry representation
is employed in the Q-patch formulation to subdivide an arbitrary 3-D surface into quadrilateral patches.
This permits the direct use of common geometry data formats, such as CAD, AGM, and IGES files.

The code accepts a parametric bi-cubic geometry representation, which substantially simplifies
preparation of the input data for CARLOS-Q in comparison to the original facet approach.

This feature is especially advantageous for multi-frequency calculations since the discretization
density can be changed without having to re-mesh the surface, which is a costly and time-consuming
part of the overall EM modeling process.

The Q-patch formulation is an efficient EM modeling technique for arbitrary 3-D surfaces which
employs a set of roof-top basis functions to represent the unknown surface currents on each patch.
These basis functions are edge-based and are similar to the Rao-Wilton-Glisson basis functions for
triangular facets. This feature, together with the modularity of CARLOS-3D, makes it possible to
have a hybrid mesh representation of the surface geometry that combines triangular facets with
quadrilateral patches, not discussed here. The combination of the parametric geometry representation
and the edge-based basis functions substantially improves the modeling capability of complex
geometries.

2.3  CARLOS-SW 

CARLOS-SW  incorporates  a  set  of  higher  order  basis  functions  that  simulate  the  surface-wave  effects 
on the boundary of a scatterer. The purpose of this code is to maximize the reduction of unknowns by
employing a set of surface-wave (SW) basis functions that efficiently represents the unknown surface
currents. These SW basis functions consist of a slowly-varying function multiplied by a phase factor
that  resembles  the  surface  diffraction  terms  developed  for  high  frequency  approximations.  The 
slowly-varying  function  is  expressed  in  terms  of  Chebyshev  functions.  In  addition  a  physical  optics 
term  is  added  to  the  current  representation  to  further  reduce  the  number  of  unknowns  required  in  the 
calculation.  Details  of  this  analysis  can  be  found  in  Ref.  4. 

We  have  validated  CARLOS-SW  for  both  2-D  and  3-D  geometries.  Substantial  reduction  of  the 
number of unknowns was observed. For example, the scattering from a 20λ wide, 2-D curved surface
required only 26 unknowns for CARLOS-SW, while 250 unknowns were required by a conventional
MM code.

2.4  CLOAK 

CLOAK  is  a  general  purpose  MM  code  that  operates  on  exact  surface  descriptions  of  the  scatterer, 
represented  by  combinations  of  B-surfaces  and  parametric  bi-cubic  curvilinear  patches.  The  basis 
functions  are  higher  order  polynomial  expansions  which  map  directly  onto  the  surface  isoparametric 
lines. With this higher order formulation, a typical discretization density is roughly four basis
functions per wavelength. Of course, the double surface integrations required to calculate matrix
elements  result  in  greater  matrix  fill  time  than  with  simpler  basis  functions. 

Although  CLOAK  was  originally  developed  independently,  the  defining  features  of  the  code  have 
recently  been  integrated  into  the  CARLOS-3D  framework.  A  major  advantage  of  using  an  exact 
surface  representation  is  that  the  geometry  description  is  frequency  independent,  meaning  that  it  only 
needs  to  be  generated  one  time.  Other  techniques  require  re-discretization  for  each  different 
frequency regime, which, as noted previously, can be a costly step.


3.0  Measurands 

Four principal measurands are chosen for comparison. They are: the density of basis functions
required to span a given scatterer, the total number of unknowns (degrees of freedom) associated with
this sampling, the matrix fill time, and the accuracy achieved for different classes of “observables” to
be elucidated later. Note that the matrix storage requirements for all of these techniques are directly
related to the total number of unknowns.
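
As a rough illustration of how storage scales with the number of unknowns, the sketch below assumes a dense N x N matrix of double-precision complex entries (16 bytes each). Both the dense layout and the entry size are assumptions made for illustration; the text states only that storage is directly related to the number of unknowns. The function name is hypothetical.

```python
def dense_matrix_bytes(n_unknowns, bytes_per_entry=16):
    """Storage for a dense n x n matrix (assumed: complex double, 16 bytes/entry)."""
    return n_unknowns * n_unknowns * bytes_per_entry


if __name__ == "__main__":
    # Unknown counts taken from the CARLOS-3D convergence table.
    for n in (360, 688, 1300):
        mib = dense_matrix_bytes(n) / 2**20
        print(f"N = {n:5d}: {mib:7.2f} MiB")
```

Under these assumptions, halving the number of unknowns cuts the storage by a factor of four, which is why the reduction in unknowns offered by higher order formulations matters.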

3.1  Sampling  Density  and  Number  of  Unknowns 

In  discussing  the  sampling  density  required  for  a  given  technique,  two  issues  are  important.  First,  the 
shape  of  the  surface  must  be  discretized  and  represented  in  a  fashion  that  will  accurately  represent  the 
true shape of the scattering geometry. Second, this geometry sampling must be consistent with the
accuracy achievable by the basis function used in order to capture the correct variations of the
unknown surface currents. In general, the simpler the surface representation and associated basis
functions, the easier it is to compute the MM matrix elements. When more complicated surface
representations and basis functions are used, more work must be done in computing the individual
matrix  elements.  However,  these  more  complicated  techniques  also  result  in  a  reduction  of  the 
number  of  matrix  elements  which  must  be  computed,  and  ultimately  in  the  overall  order  of  the  MM 
system  that  needs  to  be  solved. 

Each  of  the  four  codes  discussed  in  this  paper  uses  a  different  surface  representation  and  set  of  basis 
functions. Both the surface representations and basis functions can be characterized in two broad
categories: flat facet-based representations versus curved discretizations, and linear basis functions
versus  higher  order  polynomial  basis  functions.  The  flat-facet  surface  representation  essentially 
ignores  any  curvature  and  relies  on  higher  sampling  density  to  accurately  capture  this  information, 
while  the  curved  surface  discretizations  can  capture  the  curvature  to  any  desired  degree  of  accuracy. 
Similarly,  the  linear  basis  functions  require  higher  sampling  density  in  order  to  capture  rapid  current 
oscillations,  while  the  higher  order  basis  functions  have  much  of  this  built  into  the  basis  functions 
themselves  and  hence  allow  more  coarse  sampling. 

CARLOS-3D  uses  the  Rao-Wilton-Glisson  roof-top  basis  functions  applied  to  a  triangularly-faceted 
surface description. With this method, a given triangle pair sharing a common edge forms the basis
function with the unknown current coefficient being associated with the common edge. The basis
function allows the current to vary linearly from zero, at the vertices opposite this common edge, to a
maximum value at some point along the common edge. Since the surface representation and basis
functions are both linear, typically this formulation requires a higher sampling density to achieve a
given solution accuracy, usually on the order of 100 triangles distributed per square wavelength, or
8-10 basis functions in a linear dimension.
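
The link between sampling density and unknown count for edge-based basis functions can be sketched for a flat rectangular plate. The sketch assumes a structured mesh: the plate is cut into square cells, each cell is split into two triangles by one diagonal, and one unknown is assigned to every interior edge (boundary edges of an open plate carry none). Treating the sampling density as cells per wavelength is an assumption, and the function names are hypothetical.

```python
def rwg_unknowns_open_plate(len_x, len_y, cells_per_wavelength):
    """Interior-edge count for a structured triangulation of an open plate.

    len_x, len_y are plate dimensions in wavelengths. Each square cell is
    split by one diagonal; only interior edges carry an unknown.
    """
    nx = round(len_x * cells_per_wavelength)
    ny = round(len_y * cells_per_wavelength)
    horizontal = nx * (ny - 1)   # edges shared by vertically adjacent cells
    vertical = (nx - 1) * ny     # edges shared by horizontally adjacent cells
    diagonal = nx * ny           # one diagonal inside every cell
    return horizontal + vertical + diagonal


def triangles_per_square_wavelength(cells_per_wavelength):
    """Two triangles per square cell."""
    return 2 * cells_per_wavelength ** 2


if __name__ == "__main__":
    print(triangles_per_square_wavelength(7))       # 98, close to 100
    print(rwg_unknowns_open_plate(3.5, 2.0, 8))     # 1300
```

For a 3.5λ x 2λ plate at 8 cells per wavelength this count is 1300, which coincides with the densest CARLOS-3D entry in the convergence table. The coarser entries there do not follow this pattern, so the meshes actually used presumably differed from this idealized one.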


CARLOS-Q  enhances  the  geometry  modeling  capability  of  CARLOS-3D.  This  technique  employs 
the  parametric  bi-cubic  surface  representation  to  provide  direct  access  to  the  geometry  data  from 
commonly used geometry software packages. In addition, the edge-based basis functions used in
CARLOS-Q are similar to the roof-top functions in the triangular facet-based CARLOS-3D
formulation, which allows hybrid formulations of the two different techniques. The curvilinear
(quadrilateral)  patches  used  in  the  Q-patch  formulation  not  only  provide  an  accurate  surface 
representation  but  also  help  to  reduce  the  number  of  unknowns  by  eliminating  the  unnecessary  ones. 

CARLOS-SW utilizes entire domain basis functions which take advantage of asymptotic
approximations of surface diffraction terms in order to reduce the number of unknowns. The surface
of the scatterer is first subdivided into smooth, irregular regions, where the smooth surface is treated
as one curved patch. The size of the patch determines the required number of basis functions. We
have observed that the solution converges with as few as 5 basis functions per wavelength for surface
dimensions less than 10λ.

The  CLOAK  code  utilizes  curved  bi-cubic  patches  or  B-surfaces  to  represent  the  scatterer.  Each  of 
these curved surfaces is then partitioned into a quilt of curved quadrilateral sub-patches, about
one-quarter wavelength on a side, and higher-order polynomial basis functions are applied. Thus, as the
frequency changes, CLOAK automatically adjusts the density of basis functions applied to the
geometry. The total number of basis functions is approximately twice the number of sub-patches. So,
if the quilt is composed of sub-patches sampled at a linear density of 4 per wavelength (16 sub-patches
per square wavelength), the total number of unknowns per square wavelength is approximately 32.


3.2  Matrix  fill 

Accurate numerical evaluation of the matrix elements arising in the MM technique is a key
determinant of the resulting accuracy of the overall results of the computations. In this
technique, the mathematical expressions of the matrix elements for 3-D objects require in general a
four-fold surface integration to be carried out. This integration is obviously a function of the surface
representation  used  for  the  object  as  well  as  the  basis/testing  functions  used.  Additionally,  the 
particular  type  of  quadrature  used  will  strongly  impact  both  the  accuracy  and  fill  time. 

Extensive  investigations  have  been  carried  out  to  achieve  the  highest  accuracy  possible,  particularly 
for the self-term (diagonal) elements of the matrix. Various numerical and semi-analytical methods
have been proposed, including singularity extraction. The four codes being compared use different
methods for this vital step of the numerical implementation of the MM technique. In principle, any
degree of accuracy can be achieved via quadrature methods at the cost of much increased computation
times. Since matrix fill is proportional to the square of the number of unknowns, it is imperative
to adopt the most accurate yet computationally efficient integration technique.
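
The benefit of singularity extraction can be sketched on a one-dimensional stand-in problem. This is not the four-fold surface integral of the codes discussed here: the integrand cos(kx) ln(x) on [0, 1], the midpoint rule, and all function names are assumptions chosen only to keep the example short. The singular part ln(x) is integrated analytically (its integral over [0, 1] is -1), and only the bounded remainder (cos(kx) - 1) ln(x) is left to the quadrature. The reference value -Si(k)/k follows from integration by parts.

```python
import math


def sine_integral(k, terms=40):
    """Si(k) from its Maclaurin series (adequate for small to moderate k)."""
    total, term = 0.0, k          # term holds (-1)^n k^(2n+1) / (2n+1)!
    for n in range(terms):
        total += term / (2 * n + 1)
        term *= -k * k / ((2 * n + 2) * (2 * n + 3))
    return total


def midpoint(f, n):
    """n-point midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


def direct(k, n):
    """Quadrature applied to the singular integrand as it stands."""
    return midpoint(lambda x: math.cos(k * x) * math.log(x), n)


def extracted(k, n):
    """Quadrature on the bounded remainder plus the analytic singular part."""
    smooth = midpoint(lambda x: (math.cos(k * x) - 1.0) * math.log(x), n)
    return smooth - 1.0           # integral of ln(x) over [0, 1] is exactly -1


if __name__ == "__main__":
    k = 2.0
    exact = -sine_integral(k) / k
    for n in (4, 8, 16):
        print(n, abs(direct(k, n) - exact), abs(extracted(k, n) - exact))
```

For the same number of quadrature points the extracted form is markedly more accurate, which is the trade between fill time and accuracy that the text describes.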

3.3  Accuracy 

A  single  test  case  was  chosen  to  provide  a  quantitative  comparison  of  the  accuracy  of  the  four 
techniques.  Although  the  “accuracy”  achieved  by  a  given  code  will  depend  on  many  factors, 
including  the  geometry  itself,  the  comparison  is  intended  to  provide  the  reader  with  a  feel  for  the 
convergence  properties  of  the  different  numerical  techniques  which  can  be  weighed  against  their 
relative  computational  costs.  To  that  end,  the  geometry  chosen  is  the  well-known  business  card,  the 
EMCC test case number 4 (Ref. 5). This case is a thin metal plate, 3.5λ x 2λ, with an azimuthal cut
at 10° elevation from the surface of the plate. Since the geometry is symmetric about the x and y axes,
a monostatic cut from φ = 0° to φ = 90° suffices, where φ = 0° corresponds to incidence along the
short  side  of  the  plate. 

In  the  subsequent  comparisons  between  the  different  codes,  a  baseline  “exact”  solution  is  used  to 
quantify  a  measure  of  “accuracy.”  This  “exact”  solution  is  a  numerical  entire  domain  MM  solution, 
which  used  105  basis  functions  in  the  long  direction  of  the  plate  and  30  in  the  short  direction,  a 
density  which  is  well  beyond  the  necessary  limit  for  convergence.  Figure  1  provides  a  comparison  of 
this  “exact”  solution  with  measured  data  provided  by  the  EMCC.  Although  the  computed  data  differs 
from  the  measurements  in  some  areas,  the  measured  data  is  not  without  its  own  inaccuracies. 


Each  of  the  four  codes  was  used  to  generate  a  set  of  three  different  solutions  for  this  test  case, 
representing  low,  medium,  and  high  density  sampling  in  order  to  demonstrate  the  convergence 
properties  for  each  method.  These  convergence  results  are  shown  in  Figures  2-5. 

Figure 5: Convergence of the CLOAK code for EMCC test case number 4.


The  accuracy  results  of  these  12  different  cases  are  summarized  in  Tables  I  -  IV,  where  the  accuracy 
measure  is  computed  as  the  root-mean-square  error  between  the  given  numerical  simulation  and  the 
“exact”  entire  domain  solution,  normalized  by  the  RMS  value  of  the  “exact”  solution.  Note  that  these 
calculations  were  performed  on  the  normalized  cross  section  values,  not  on  a  logarithmic  (dB)  scale. 
The column labeled “Basis Fcns/λ” represents the sampling density in a linear dimension and the
column  “Total  Unknowns”  gives  the  order  of  the  overall  MM  matrix  equation  that  was  solved.  Note 
that  matrix  fill  time  is  not  presented  here  due  to  the  many  factors  which  influence  this  measurand, 
such  as  the  particular  choice  of  quadrature  formulas  used  for  the  numerical  integration. 
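
The accuracy measure described above can be written down compactly. The sketch below is one reading of that description: the RMS of the difference between a computed pattern and the reference pattern, divided by the RMS of the reference, evaluated on linear cross-section values. The function names and the dB-to-linear helper are hypothetical and not taken from any of the four codes.

```python
import math


def db_to_linear(values_db):
    """Convert cross sections from dB to a linear scale."""
    return [10.0 ** (v / 10.0) for v in values_db]


def normalized_rms_error(computed, reference):
    """RMS(computed - reference) / RMS(reference), both on a linear scale."""
    if len(computed) != len(reference):
        raise ValueError("patterns must be sampled at the same angles")
    num = sum((c - r) ** 2 for c, r in zip(computed, reference))
    den = sum(r ** 2 for r in reference)
    return math.sqrt(num / den)


if __name__ == "__main__":
    reference = db_to_linear([-10.0, -20.0, -15.0])
    computed = db_to_linear([-10.5, -19.0, -15.2])
    print(normalized_rms_error(computed, reference))
```

Because the sums run over the same sample points, the 1/M factors of the two RMS values cancel, so the ratio of sums is sufficient. An error of 1.0 on this measure means the discrepancy is as large as the reference pattern itself.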


Table I
Summary of CARLOS-3D convergence.

Basis Fcns/λ   Total Unknowns   Error HH   Error VV
4              360              0.574      0.140
6              688              0.275      0.091
8              1300             0.173      0.062

Table II
Summary of CARLOS-Q convergence.

Basis Fcns/λ   Total Unknowns   Error HH   Error VV
3              126              1.151      0.165
5              313              0.319      0.068
10             1345             0.084      0.023

Table III
Summary of CARLOS-SW convergence.

Basis Fcns/λ   Total Unknowns   Error HH   Error VV
3              104              1.461      0.613
5              264              0.567      0.172
7              480              0.184      0.246

Table IV
Summary of CLOAK convergence.

Basis Fcns/λ   Total Unknowns   Error HH   Error VV
2.5            152              0.065      0.041
4              318              0.031      0.007
6              642              0.026      0.007



5.0  Summary 

Four different Galerkin MM formulations for arbitrary 3-D geometries have been described. A
comparison was made with respect to geometry modeling accuracy, matrix fill time, and far-field
solution accuracy. It was noted that the use of a generalized operator notation in the MM solution
permits the incorporation of different geometry representations and basis functions within a single
code architecture. This also allows the incorporation of hybrid methods which use combinations of
the  different  formulations  discussed  in  this  paper. 

Although  this  paper  has  only  presented  results  for  the  “business  card”  test  case,  additional  quantitative 
comparisons have been made with the four codes. Some of these include far-field solutions for a
sphere, a curved shell, the NASA almond and ogive, and a complex curved surface. These results
provide further quantitative assessment of the relative performance of the codes compared in this
paper. 


REQUIRING QUANTITATIVE ACCURACY STATEMENTS IN EM DATA

E. K. Miller, Stocker Visiting Professor
Ohio University, Athens, OH 45701, 614-593-1603


ABSTRACT 

Of all the activities associated with developing computer models for any application including
computational electromagnetics (CEM), validation must be considered the most critical over the long
term. Without quantitative assurance that the results produced by a model are commensurate with the
needs of the intended application, there will always remain questions concerning whether analysis and
design evaluations based on that model can be relied upon. Unfortunately, in contrast to the situation in
years past when experimental measurements were the principal source of such data and error estimates
were routinely expected to accompany them, the growth in computer-model results seems to be
characterized by an almost complete lack of meaningful accuracy statements. As a means of correcting
this problem, it is first proposed that some modest initial steps be taken, such as requiring error estimates
to be included with all computed data published in the CEM literature. The rest of the paper provides
additional background and considers aspects of model validation as a means of improving overall CEM
model utility. Several of the author's papers on this topic are listed as references.


A  PROPOSAL  FOR  QUALITY  CONTROL  OF  PUBLISHED  NUMERICAL  RESULTS 

With  the  proliferation  of  quantitative  results  in  EM  arising  from  ever  more  sophisticated  experimental 
apparatus  and  from  almost  universal  use  of  computer  models,  the  need  to  accompany  such  data  with 
accuracy  (or  error)  estimates  is  becoming  acute.  It’s  not  unusual  to  find  published  articles  that  contain 
numerical  results  having  no  independent  confirmation  of  their  accuracy  or  even  lacking  any  discussion  of 
how  good  the  author  considers  the  results  to  be.  When  accuracy  is  considered,  it  is  becoming  quite 
common to find results from another numerical computation, such as a “moment-method” model,
referred to as the “exact” solution without any accompanying measured data. Furthermore, in such
cases, the usual discussion finds only qualitative descriptors such as “good agreement” or “excellent
results” being used. Even when accuracy is quantitatively addressed, the comments made about it are
usually  inconsistent  and  incomplete.  For  example,  one  author  might  consider  only  impedance  errors 
important,  another  might  view  peak  gain  the  primary  quantity  of  interest,  and  still  another  shifts  in 
pattern  nulls  the  limiting  factor. 

Consequently, with few exceptions, the reader is left with only a fuzzy, qualitative interpretation of
accuracy and error concerning the results actually presented in the publication, let alone the accuracy and
error characteristics of the process (computational or experimental) by which that data has been generated
if that process were to be used for substantially similar problems. The situation only gets more fuzzy
when the reader wonders about the accuracy of that process and its error characteristics when used for
substantially different problems. This paper addresses one approach to alleviating this problem.
Basically, the idea is to implement a policy that requires an author to add a separate section following the
introduction to any article in which computed results are presented. In that section, statements of the
following  general  form,  with  whatever  elaboration  is  appropriate,  will  be  included: 

“The results presented here are estimated to be accurate to ____,” where
quantitative statements such as:

“The error in peak gain is < 0.5 dB”; or

“Nulls in the scattering pattern are located to within 2 deg”; or

“Input impedance is obtained to within 5 ohms”; or

“Etc.”

are made. This would be followed with the mandatory additional statements below.

“This estimate is based on using the following kind(s) of validation exercise(s)
____,” where whatever experimental, analytical and/or computational validation
that has been used is summarized.

“The kinds of problems for which the above error/accuracy statement can be made
include ____,” where the problem geometry and electrical characteristics are
summarized.

“Nominal sampling densities required to achieve these estimated accuracies are
____,” where wavelength-dependent and/or geometry-dependent values are given.

“The operation count needed to exercise the model reported here is estimated
nominally to be A N^x, ...” or some equivalent statement, where numerical values
for A and x are given. While providing computer running times is acceptable, that
alone is not enough because there is such a variation in computer architectures that
model-to-model comparisons based on running time are not very informative.

“... and the variable storage needed to exercise this model is estimated nominally to
be B N^y,” where again numerical values for B and y are given.
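
One way an author might arrive at the numerical values requested for the operation-count statement is a least-squares fit on logarithmic axes. The sketch below assumes the count follows a power law A N^x in the number of unknowns N; the symbol N, the function name, and the sample data are illustrative assumptions, not values from any reported model.

```python
import math


def fit_power_law(sizes, counts):
    """Fit counts ~ A * N**x by linear least squares on log-log axes.

    Returns (A, x). Requires at least two distinct problem sizes.
    """
    log_n = [math.log(n) for n in sizes]
    log_c = [math.log(c) for c in counts]
    m = len(log_n)
    mean_n = sum(log_n) / m
    mean_c = sum(log_c) / m
    sxx = sum((u - mean_n) ** 2 for u in log_n)
    sxy = sum((u - mean_n) * (v - mean_c) for u, v in zip(log_n, log_c))
    x = sxy / sxx                         # slope on log-log axes
    a = math.exp(mean_c - x * mean_n)     # intercept gives the prefactor
    return a, x


if __name__ == "__main__":
    # Synthetic operation counts following 2 * N**3 exactly.
    sizes = [100, 200, 400, 800]
    counts = [2.0 * n ** 3 for n in sizes]
    a, x = fit_power_law(sizes, counts)
    print(f"A = {a:.3f}, x = {x:.3f}")
```

The same routine applied to measured storage would give the corresponding storage coefficient and exponent. Fitting operation counts rather than wall-clock times keeps the result independent of the computer architecture, which is the concern raised above.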

In conclusion, we note that the above procedure is aimed at validating the results obtained from a given
model when used for a specific application. This is much different from validating the model itself,
which is a more general and open-ended problem and one to which the above procedures would
contribute but not resolve. Both problems are discussed further below. Also note that simply including
independent data obtained from measurement, analysis or computation would itself be sufficient to
satisfy the above quantitative validation requirements so long as the check data itself has associated error
estimates from which such inferences can be drawn or when error estimates are derived by the author(s)
from comparing the independent data and the model results in question. The above proposal constitutes
the main thesis of this presentation. In order to round out the discussion and provide further
background, we now examine the problem of model validation in more detail.


In this section we consider the steps involved in developing a CEM model and the role of validation in
that process and then examine the general requirement of verifying and validating the results obtained and
the model itself. There being no uniformity of usage, the terms “model,” “code,” and “program” are
used interchangeably here as names for the software that's used for EM computations.

Developing a CEM Model

The process of developing a computer model involves a small number of basic steps whatever particular
details are involved. Because these steps are universally encountered, it is worthwhile to summarize
them as follows:

Conceptualization--Encapsulating observation and analysis in terms of
elementary physical principles and their mathematical description (e.g., the idea that
a Green's function can be used to represent the fields of arbitrary source
distributions).

Formulation--"Fleshing out" of the elementary description into a more complete,
formally solved, mathematical representation (e.g., development of an integral
equation from a source integral and required boundary conditions on field
behavior).

Implementation--Transforming the formulation into a computer algorithm using
various numerical techniques (e.g., using the method of moments).

Computation--Defining the model and obtaining quantitative results (the
"crank-turning" stage).

Approximation--Simplifying operations and assumptions that can arise at any
step in the process of model development and application (e.g., reducing a surface
integral to a line integral by using the "thin-wire" approximation or approximating a
solid conductor with a wire mesh).

Validation--Determining the numerical and physical credibility of the computed
results (using analytical, experimental, and/or other numerical results for
comparison with the computer model being employed).

For any applications-oriented software, the most time-consuming and laborious of these steps is the
last, that of validation. This is becoming increasingly the case as growing computer power has expanded
the size, complexity, and volume of problems that are routinely modeled. Whereas computer resources
available in the 1960s limited the amount of data needed to describe problems and represent the results,
there  has  been  an  explosive  growth  in  the  scope  of  applications  as  summarized  in  Table  I. 

Long  after  work  on  the  model  has  been  completed,  questions  will  continue  to  arise  about  its 
performance.  Such  questions  include: 


• Is a given result valid?

• Can the model be used reliably for a given problem?

• What might be the numerical accuracy and physical relevancy of the results that
are obtained?

These  questions  become  especially  important  as  modeling  moves  from  a  primarily  research  environment 
to one which involves an increasing emphasis on analysis and design applications. The difficulties
caused by uncertainty over model performance also increase with the expanding proliferation of
modeling codes and computational resources becoming available. The question of perceived model
validity is of particular concern in that it can lead to correct results being rejected because of unwarranted
skepticism or acceptance of incorrect results because of misplaced confidence. Either outcome is
undesirable and both should be avoided by developing the validation procedures needed for an
appropriate level of user confidence to be achieved.

Without  essentially  "exact"  results  to  serve  as  benchmarks,  there  will  always  be  some  lingering  doubts 
regarding  the  validity,  let  alone  accuracy,  of  computer  models.  Unfortunately,  as  is  well  known,  there 
are few closed-form, exact solutions available from classical electromagnetics. For a 3D computer model
to match results for a spherical body is hardly convincing evidence that the same model will work as well
for a more arbitrary body geometry. Checks of the kind provided by the sphere can be regarded only as
necessary, but not sufficient, conditions for solution validity. But without reference solutions to provide
benchmark results, quantification of computer-model accuracy and validity for the most part will
continue to remain an open question.

Verification  and  Validation 

The term “validation” as generally used actually covers two related, but distinctly different, aspects of
computation: verification and validation. Unless otherwise specified, both are considered here to be
included under validation. Verification determines that a modeling code produces results consistent with
its  design  while  validation  establishes  how  well  its  results  conform  to  physical  reality.  Both  are 
ingredients  essential  to  performing  reliable  modeling  computations.  Verification  is  a  necessary,  but  not 
sufficient,  condition  for  achieving  acceptable  code  performance,  while  validation  determines  how  reliably 
a  given  code  can  be  applied  to  physically  meaningful  problems.  Clearly,  the  latter  aspect  of  code 
performance  cannot  be  reached  without  confirming  the  former  which  is  why  validation  is  the  more 
general  term. 

Computational  checks  at  various  points  in  the  computation  would  be  advantageous  in  establishing 
quantitative measures of code performance with respect to both verification and validation. These checks
would address such problems as:

1) moving codes between computers;

2) confirming the continued valid operation of the code over time on a given computer; and

3)  guiding  the  user  concerning  the  validity  of  the  computed  results. 

Computational  models  would  ideally  also  include  features  that  support  "dialable"  accuracy  to  permit  an 
explicit  tradeoff  between  the  cost  of  the  computation  and  the  accuracy  of  the  results. 

The first step in assessing computational accuracy stems from the two sources of error in any modeling
exercise. These are the physical modeling error (E_P), which arises from approximating the physical
problem of interest with some idealized mathematical representation, and the numerical modeling error
(E_N), which occurs because only an approximate numerical solution is obtained to that idealized model.

Determining E_P will require access to measured data because most problems require some physical
approximation, such as representing a smoothly curved object by plane, triangular facets. Given
adequate computational resources, E_N can always be made smaller than E_P. The essence of the
verification and validation approach outlined is to develop a protocol for systematically and consistently
estimating E_N in response to the three points above.

Several  options  could  be  considered  for  this  purpose,  but  they  are  rarely  utilized  in  current  modeling 
codes. For items (1) and (2) above, for example, it would be advantageous to users if model developers
were to include a set of precomputed test cases, including the model input; results at various stages of the
computation; and the final observables such as radar cross sections and/or thermal emissions.
Concerning item (3), including a user option to exercise various validation checks that might range from
testing far-field reciprocity, to evaluating boundary errors, or even comparing results from two different
numerical models would be extremely helpful. Finally, it would be especially valuable to more casual
users if a code offered the modeler a quantitative "figure of merit" (FoM) to indicate the reliability of the
computed  results. 
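
The precomputed test cases suggested for items (1) and (2) amount to a regression check: the code is rerun on stored inputs and its observables are compared with stored reference values. The sketch below shows one minimal form such a check could take. The function name, the dictionary layout, the observable names, and the tolerance are all hypothetical, and the reference numbers are placeholders rather than data from any code.

```python
def check_against_reference(computed, reference, rel_tol=1e-6):
    """Compare computed observables with stored reference values.

    Both arguments map an observable name to a number. Returns a list of
    (name, relative_error) for every entry outside the tolerance; a missing
    observable is reported with relative_error None.
    """
    failures = []
    for name, ref in reference.items():
        if name not in computed:
            failures.append((name, None))
            continue
        scale = max(abs(ref), 1e-300)     # guard against a zero reference
        rel_err = abs(computed[name] - ref) / scale
        if rel_err > rel_tol:
            failures.append((name, rel_err))
    return failures


if __name__ == "__main__":
    # Placeholder values standing in for stored test-case observables.
    reference = {"rcs_phi_000": 1.25e-2, "rcs_phi_045": 3.10e-3}
    after_port = {"rcs_phi_000": 1.25e-2, "rcs_phi_045": 3.40e-3}
    for name, err in check_against_reference(after_port, reference):
        print("mismatch:", name, err)
```

An empty result after moving the code to a new computer, or after rerunning it months later on the same one, supports verification only. It says nothing about how well the stored reference values themselves match physical reality.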

Clearly,  verification  and  validation  options  range  from  being  quite  easily  implemented  to  posing 
significant  research  challenges.  As  problem  complexity  and  the  associated  total  FLOP  count  continue  to 
increase with faster computers, such options will become increasingly essential to assisting users in
achieving effective code usage in analysis and design. We now examine some specific approaches to
realizing  a  validation  ethic  or  protocol. 


ONE  APPROACH  TO  VALIDATION 

It might be helpful at this point to clarify what validation should mean in a modeling context. Among the
definitions given by the 1975 American Heritage Dictionary for “validate” are the following:

-to declare or make legally valid; to mark with an indication of official sanction; to substantiate
or verify;

while under “verify” we find

-to confirm or substantiate in law by oath; to establish the truth, accuracy or reality of,

with the most relevant of both meanings provided by the last statement concerning the “truth, accuracy or
reality of.” Proceeding with the understanding that the inclusive term validation means to establish the
truth, accuracy or reality of model performance, several key issues arise as are discussed briefly below.

What Problems and Solutions?--The first decision to be made in model validation is the
selection of appropriate test problems. Although numerical validation can (and should) be performed
within a model itself using internal checks as discussed below, a more logical starting point for model
validation is the use of external data or checks. From an analytical viewpoint, we need to consider for
which problems answers, preferably of known accuracy, are available to serve as independent sources of
results. Alternatively, are there certain kinds of problems for which experimental data of needed
accuracy can be obtained? Furthermore, from among that set of candidate analytical and/or experimental
problems, which provide the most appropriate testing of model capabilities? For example, while one
way of validating a wire code such as NEC can be to compare results for scattering from a wire-mesh
model of a sphere with the Mie series, that would not be especially relevant for determining the code's
performance when used for modeling wire antennas. As discussed further below, we suggest that the
problems and solutions that are selected for validation purposes might be usefully assigned to one of two
categories, described as primary and secondary benchmarks.

What Comparisons?--A second decision concerning model validation involves which
quantities are to be used for this purpose. Most obviously, these quantities should include physical
observables for which measured data, at least in principle, can be obtained. But mathematical quantities
and relationships which might be essentially inaccessible using measurements might also be useful as
sources of data for validation purposes. For example, the eigenvalues and eigenvectors of the
moment-method impedance matrix might be candidates for use in model validation, but they are not directly
measurable in general.

How Accomplished?--Finally, we must carefully consider how the quantities chosen for use
in model validation are to be compared, and over what range of variables and parameters the
comparisons should be performed. The spatial or temporal variation of a given field quantity might be
appropriate  for  some  applications.  On  the  other  hand,  a  result  derived  from  integrated  measures  of  such 
quantities,  for  example  total  scattered  power,  might  be  more  relevant  for  other  purposes.  The  former 
approach  might  be  called  microscopic  in  that  the  fine  structure  of  the  solution  is  being  examined,  while 
the  latter  could  be  described  as  macroscopic  because  it  provides  a  less  detailed  but  broader  means  of 
comparison. 

These various issues are not easily settled and will require thoughtful consideration and, most likely,
systematic refinement as procedures for model validation evolve. We are suggesting essentially that an
experimental protocol be developed for this purpose. This protocol would set down clearly defined
procedures for validating present and future models in an agreed-upon way that is both physically and
numerically relevant to intended applications.


VALIDATING  USING  INTERNAL  AND  EXTERNAL  CHECKS 

We noted above that essentially two kinds of procedures can be used to establish some quantitative
measure of code validity:

1) Internal Validation, a check that can be made concerning solution validity
within the model itself; and

2)  External  Validation,  a  check  that  utilizes  information  from  other  sources 
which  could  be  analytical,  experimental  or  numerical. 

Internal Checks--Existing computer models often do not perform internal checks on the results
they  produce,  but  instead  leave  that  as  an  exercise  for  the  user.  For  example,  NEC  could  provide  and 
indeed  has  been  exercised  to  give  various  kinds  of  checks  relating  to  power  balance,  reciprocity  and 
boundary-condition  matching.  But  the  software  to  perform  the  wide  range  of  internal  checks  that  might 
be  most  useful  for  a  given  application  is  usually  not  completely  implemented  in  a  code,  but  instead  may 
need  to  be  "patched  in"  by  the  user  for  a  particular  problem  and  check.  It  would  be  extremely  valuable  if 
a  variety  of  such  checks  were  to  be  built  into  the  code  by  the  developer  so  they  were  available  to  be 
exercised  as  desired  by  the  modeler. 

As  a  particular  example  of  the  possible  applications  of  internal  checks,  consider  the  case  when  a  problem 
new  to  the  modeler  is  encountered  and  the  initial  results  are  obtained.  Present  practice  usually  involves 
"eye-balling"  the  data  to  see  if  it  "feels"  right,  perhaps  having  first  run  some  documented  test  cases  to 
verify  code  performance.  Since  these  test  cases  would  not  likely  resemble  the  new  problem,  their 
successful solution might not provide much insight concerning the new results. If, however, a series of
checks built into the code could then be exercised at the modeler's discretion to verify that conditions
necessary for a valid solution of Maxwell's Equations are satisfied, confidence in the model's numerical
reliability could be more readily established. These checks might range from fairly exhaustive, such as
computing boundary fields to determine how well boundary conditions are satisfied, to fairly simple,
such as evaluating the degree to which reciprocity and power conservation are demonstrated. They could
only be viewed as necessary but not sufficient conditions for solution validity, and could only involve
such behavioral aspects as are not implicit in the model already (e.g., some formulations produce
symmetric matrices so that bistatic scattering and transmit-receive reciprocity are assured analytically).
Developing  a  figure-of-merit  from  the  results  of  such  checks  that  would  provide  a  "quality  factor"  (or 
more  if  application-specific  measures  are  useful)  for  the  solution  in  a  single  number  seems  not  only 
feasible  but  highly  desirable. 
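
The internal checks described above can be sketched in a few lines. The following is a minimal illustration, not part of NEC or any existing code: the function names, the sample two-port admittance matrix, the power totals, and in particular the mapping from relative errors to a single "quality factor" are all assumptions made for the example.

```python
def reciprocity_error(y):
    """Relative asymmetry ||Y - Y^T|| / ||Y|| of a square transfer matrix."""
    n = len(y)
    num = sum(abs(y[i][j] - y[j][i]) ** 2 for i in range(n) for j in range(n))
    den = sum(abs(y[i][j]) ** 2 for i in range(n) for j in range(n))
    return (num / den) ** 0.5 if den else 0.0


def power_balance_error(p_in, p_rad, p_loss=0.0):
    """Relative mismatch between input power and radiated plus dissipated power."""
    return abs(p_in - (p_rad + p_loss)) / abs(p_in)


def quality_factor(errors):
    """Map the worst relative error to (0, 1]; 1.0 means every check passed exactly.
    This particular mapping is an illustrative assumption."""
    return 1.0 / (1.0 + max(errors))


# Hypothetical two-port admittance matrix and power totals from a model run.
y_matrix = [[1.0 + 0.5j, 0.20 - 0.10j],
            [0.21 - 0.10j, 0.9 + 0.4j]]
checks = [reciprocity_error(y_matrix), power_balance_error(1.00, 0.98)]
print(checks, quality_factor(checks))
```

Such checks remain necessary rather than sufficient conditions: a factor near 1 says only that reciprocity and power balance are respected, not that the answer is right.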

External Checks--The second kind of check involves use of independent data from other
sources. Perhaps the most convincing overall is experimental data, but analytical or numerical results
should be comparably useful. Indeed, one of the most convenient computational checks would be
provided by a code that permits two different numerical models to be developed for the same problem,
for example by incorporating user-selectable basis and weight functions. For greatest utility, such
checks ideally should not be microscopic or of single-point nature, e.g. a comparison of results for input
impedance at a single frequency. This is because experience shows that computer models produce
results that exhibit apparently slight frequency shifts, angle shifts or spatial shifts in field quantities with
respect to "exact" solutions, or even other computer models. When the effects of such shifts are
observed near maxima in the response of interest, they may appear relatively insignificant, but when
examined near deep minima or even nulls, the differences can become unbounded. Consequently,
macroscopic  or  global  comparisons  are  usually  more  meaningful,  but  even  they  may  not  be 
straightforward  to  interpret.  If  the  shifts  mentioned  are  observed,  it  would  seem  more  appropriate  to 
develop  a  correlation  measure  such  as  computing  the  minimum  squared  difference  between  the  two 
results  as  they  are  shifted  along  the  axis  of  the  common  variable,  rather  than  simply  doing  an  absolute 
differencing.  For  other  models  and  applications,  the  results  may  be  even  less  directly  comparable,  as  is 
the case for IE- and DE-based models. Some work is needed in the general area of how results from two
different  representations  of  the  same  problem  can  be  most  meaningfully  compared. 
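
A shift-tolerant comparison of the kind suggested above might look like the following sketch. It assumes both results are sampled on the same uniform grid of the common variable (angle or frequency) and measures the shift in whole samples; the function name and the synthetic test patterns are hypothetical.

```python
import math


def min_shifted_msd(a, b, max_shift):
    """Return (shift, msd) minimising the mean squared difference between
    a[i] and b[i + shift] over |shift| <= max_shift, using overlapping samples only."""
    best = None
    for s in range(-max_shift, max_shift + 1):
        pairs = [(a[i], b[i + s]) for i in range(len(a)) if 0 <= i + s < len(b)]
        if not pairs:
            continue
        msd = sum((x - y) ** 2 for x, y in pairs) / len(pairs)
        if best is None or msd < best[1]:
            best = (s, msd)
    return best


# Two hypothetical patterns with a null; b is a displaced by two samples.
a = [abs(math.cos(0.1 * i)) for i in range(60)]
b = [abs(math.cos(0.1 * (i - 2))) for i in range(60)]
unshifted = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
shift, msd = min_shifted_msd(a, b, max_shift=5)
print(unshifted, shift, msd)
```

Reporting both the best shift and the residual difference separates a displacement of the pattern from a genuine disagreement in its structure, which a plain point-by-point difference cannot do near nulls.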


KINDS  OF  MODELING  ERRORS 

Error types and error sources are discussed here since the type of error and its source dictate what kinds
of validation metrics might be most appropriate.

Types  of  Errors-The  error  type  refers  to  the  general  effect  produced  on  the  modeling  process 
as: 


Type 0 errors--Type-0 errors keep a program from running to conclusion,
and are therefore the most obvious when they occur, but not necessarily the easiest
to correct.

Type 1 errors--Type-1 errors occur when a program runs to conclusion to
produce  the  requested  output  which  contains  obviously  incorrect  results.  A  fairly 
common  example  is  that  of  obtaining  a  negative  input  resistance  for  an  antenna. 

Type 2 errors--Type-2 errors arise when the program runs and produces
what appear to be physically plausible results, but which are invalid for the problem
being modeled. This category of error is the most insidious because it is
generally the most difficult to identify and correct.

Type 3 errors--Type-3 errors are user dependent, as they occur when the
modeler misinterprets, mistrusts, or misuses the results produced in the
computation. It is reasonably well accepted, for example, that computer models
produce results that are generally more accurate on a relative than on an absolute
basis. Often the nulls and peaks of a radiation pattern or a transfer
function are shifted between computation and measurement, although their overall
structures may be essentially the same. A modeler unacquainted with such shifts
might consequently not accept computed results which exhibit them, even though
they are basically correct and useful for the problem under consideration.


Sources of Errors--The error source defines the cause of error as arising in at least four ways:


Software errors--These can originate from the operating-system software or
programming errors in the modeling code.

Numerical modeling errors--A numerical modeling error arises from
obtaining insufficiently accurate numerical results for the model that has been
selected, one example being non-converged results. Another example is that of
using word sizes of insufficient bit length, which affect matrix-fill and matrix-solution
accuracy, due either to machine limitations or user preference.

Physical modeling errors--A physical modeling error arises from an
inadequate "match" between the physical reality of interest and the numerical model
that has been used. A common example in antenna modeling is that of improperly
representing the source region in the numerical model, giving rise to an error which
primarily affects the input susceptance.

User errors--Aside from such obvious sources as input-data errors, this
category includes misapplication of the model by violating stated limitations intrinsic
to the formulation or its numerical implementation. An example of the former is use
of a model based on the magnetic-field integral equation for open or thin structures.
Violation of the thin-wire approximation by using segment lengths shorter than the
wire diameter is one example of the latter. User errors can also occur because the model
results are misinterpreted, mistrusted or misused due to unrealistic expectations,
unwarranted skepticism or blind faith. The two extremes of user reaction that can
follow are rejection of correct results or acceptance of wrong results, either of which
might be equally unfortunate.
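
Some user errors of this kind can be caught before the model is run. As a small illustration, the sketch below flags wire segments that are shorter than the wire diameter, the thin-wire violation mentioned above. The function name, the input layout (length and radius per segment, same units) and the adjustable threshold are assumptions for the example, not features of any particular code.

```python
def short_segments(segments, min_ratio=1.0):
    """Return indices of segments whose length is below min_ratio wire diameters.

    segments: iterable of (length, radius) pairs in the same units.
    min_ratio=1.0 reproduces the 'segment shorter than the wire diameter' test;
    a stricter value is a user choice, not something stated in the text.
    """
    return [i for i, (length, radius) in enumerate(segments)
            if length < min_ratio * 2.0 * radius]


# Hypothetical wire model: the second segment is shorter than its diameter.
wire = [(0.05, 0.001), (0.001, 0.001), (0.02, 0.002)]
print(short_segments(wire))
```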


Accuracy Metrics or Measures--Application-relevant, model-independent measures must be
developed for quantitatively assessing the accuracy of computed results. As a starting point, we might
consider the following categories of results:

1) Far-field quantities--For exterior problems, far-field quantities and results
derived therefrom are often the primary goal of the model application. These
include macroscopic or integral measures such as total far-field power as well as the
microscopic or angle-dependent results from which these quantities are derived. In
those cases where the far-field polarization properties are important, these quantities
are needed separately for the appropriate field components. Because the E and H
fields are related simply by the medium wave impedance, only one of them must be
dealt with explicitly.

2) Near-field quantities--The near fields are generally thought to provide a
more demanding measure of model performance than does the far field. Because
the  E  and  H  fields  are,  in  the  near  field,  related  by  a  position-dependent  impedance, 
both  are  relevant  quantities  for  validation  purposes.  We  note  that  for  interior 
problems,  all  fields,  by  definition,  are  near-field  quantities  in  some  sense. 

3) Boundary quantities--These are quantities associated with steps in medium
properties at surfaces on which boundary conditions are stated as part of the
problem definition. Both the fields themselves, as well as their associated sources,
are boundary quantities that are useful for validation. Derived quantities such as
antenna input admittance also fall into this category.

4) Other quantities--There are other numerical quantities provided by many
models for which no direct physical measurement can be made but which
nonetheless are useful for assessing computation accuracy. For example,
convergence of the eigenvalue (EV) spectrum as the number of unknowns is
increased in an IE-based model serves as one measure of convergence and, by
inference, of accuracy. But the EV would not be directly measurable, although it
might be obtained from computation based on the appropriate physical
measurement.

5) Approximations--Affecting the accuracy of all the above measures are the
approximations inherent in any model, whether made in the formulation or
subsequent numerical implementation. These approximations affect both the kinds
of problems for which the code can be used and the level of accuracy that might
reasonably be expected from it, and can also dictate how the code might be used for
achieving increased accuracy for problems that stretch its capabilities. For example,
a model that can be applied to geometries having edges or bends but which provides
no special treatment for such features might yield better results when the sampling
density is systematically increased in such regions.


CONCLUDING  COMMENTS 

The  problem  of  establishing  the  accuracy  of  model  results,  or  model  validation,  can  only  grow  in 
importance as the size and complexity of the problems being modeled and the uses to which they are put
continue to expand. Unfortunately, quantitative validation of specific results or of the models used to
produce them has not kept pace with the evolving status of CEM as an EM subdiscipline. While
admittedly a difficult problem, model validation is at present deficient and thus impedes all other aspects
of model development and application. Rather than continuing the status quo, it is proposed that a modest
first step towards a validation ethic be taken by requiring uniform statements concerning the accuracy or
errors of such results to be included in all future published material in the reviewed literature. It is
essential  to  the  continued  development  of  CEM  that  developers  and  users  alike  are  aware  of  the 
limitations  of  computed  results  and  how  they  can  be  appropriately  interpreted  with  respect  to  intended 
applications. 

1 Introduction

The finite element method (FEM) is attractive for modeling three-dimensional
problems because of its O(N) memory requirement and its flexibility
in geometry design and modification. The O(N) memory feature provides
favorable scaling properties as the problem size increases. Its geometrical
adaptability provides the versatility required for designing complex systems.
Thus, FEM has become the method of choice for electromagnetics CAD software,
and its applications continue to increase.

In spite of the obvious attractions of FEM for general purpose 3D electromagnetic
field solvers, it has a few drawbacks. Since the method was
initially used for solving bounded problems, its extension to open problems
is not easy. In open problems, we are interested in the behavior of the fields
infinitely far away from the structure of interest. However, it is impractical
to extend the finite element mesh very far from the scattering or radiating
structure. The normal practice is to extend the mesh a few element lengths
from the body and apply boundary conditions on the mesh termination surface.
These boundary conditions, which are local to the element and hence
preserve system sparsity, are called absorbing boundary conditions (ABCs).
Although numerous ABCs exist for 2D problems [1, 2], ABCs for 3D vector
problems are comparatively fewer. Peterson [3] derived vector ABCs for
spherical mesh truncations; however, in most practical cases, the sphere is the
least economical shape of mesh truncation in terms of computer resources.
In an earlier paper [4], we derived ABCs which can be enforced on surfaces
conformal to the structure of interest, thus optimizing computational cost.
Since that time, we have implemented these ABCs in a general purpose, 3D
finite element solver with success [5]. In this paper, we present a few more
results  which  demonstrate  that  these  ABCs  indeed  optimize  the  usage  of 
computational  resources  without  significant  degradation  in  accuracy. 

Besides the optimization of the mesh truncation strategy, we carried out
optimizations on the numerical aspects of the code. Since a finite element
code involves operations on sparse matrices, indirect addressing is a necessary
part of the programming task. This feature, combined with very short vector
lengths for the sparse matrix, results in poor vectorization and parallelization.
Essentially, this is the price paid for O(N) storage and improved scalability
of the technique. In an earlier paper [6], we detailed our efforts in
parallelizing such a code on various distributed memory, multiprocessor
architectures. The parallelization strategies that we used were extremely
successful; the code, however, ran very slowly on vector machines. In this
paper, we employ a novel data storage scheme for speeding up the computation
on vector architectures and present a strategy for reducing inter-processor
communication on multiprocessor machines.


2  Conformal  ABCs 

2.1  Theory 

In  this  section,  we  will  present  a  brief  description  of  the  conformal  ABCs 
and  examine  their  performance  with  respect  to  spherical  ABCs.  At  first,  we 
generalize  the  Wilcox  expansion  [7]  for  a  vector  field  in  the  Dupin  coordinate 
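system. Next, we apply the $\hat{n} \times \nabla \times$ operator to the electric field ($\mathbf{E}$) to
arrive at the first order absorbing boundary condition

$$\hat{n} \times \nabla \times \mathbf{E} - \left( jk_o + \kappa_m - \bar{\bar{\eta}}\, \cdot \right) \mathbf{E}_t = 0 \qquad (1)$$

for a conformal mesh termination boundary. In this expression, $k_o$ is the free
space wave number, the subscript $t$ denotes the tangential component of a
vector and

$$\kappa_m = \frac{\kappa_1 + \kappa_2}{2}, \qquad \bar{\bar{\eta}} = \kappa_1 \hat{t}_1 \hat{t}_1 + \kappa_2 \hat{t}_2 \hat{t}_2$$

where $\kappa_1, \kappa_2$ are the two principal curvatures of the ABC surface.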
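
As a quick consistency check on the first order condition, consider a spherical termination surface of radius $r$. This worked special case is added for illustration; it assumes the operator acting on $\mathbf{E}_t$ has the form $(jk_o + \kappa_m - \bar{\bar{\eta}}\,\cdot)$, with the mean curvature and curvature dyad defined as above.

```latex
\kappa_1 = \kappa_2 = \frac{1}{r}
\;\Rightarrow\;
\kappa_m = \frac{1}{r}, \qquad
\bar{\bar{\eta}} \cdot \mathbf{E}_t = \frac{1}{r}\,\mathbf{E}_t
\;\Rightarrow\;
\hat{r} \times \nabla \times \mathbf{E} - jk_o\,\mathbf{E}_t = 0
```

The two curvature terms cancel, so on a sphere the condition reduces to the familiar Sommerfeld-type first order vector ABC, as one would expect of a generalization of the spherical case.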

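The second order ABC is obtained by using the $\hat{n} \times \nabla \times$ operator once
more and subsequently simplifying the resulting expression to yield

$$-\left( D - 2\kappa_m \right) \hat{n} \times \nabla \times \mathbf{E} + \left\{ 4\kappa_m^2 - \kappa_g + D \left( jk_o - \bar{\bar{\eta}} \right) + \left( \bar{\bar{\eta}} \right)^2 + \kappa_m \Delta\kappa\, \bar{\bar{C}} \right\} \cdot \mathbf{E}_t + \nabla \times \left\{ \hat{n} \left( \nabla \times \mathbf{E} \right)_n \right\} + \left( jk_o + 3\kappa_m - \frac{\kappa_g}{\kappa_m} - 2\bar{\bar{\eta}}\, \cdot \right) \nabla_t E_n = 0 \qquad (2)$$

where $\kappa_g = \kappa_1 \kappa_2$ is the Gaussian curvature, the subscript $n$ denotes the
normal component of a vector and

$$D = 2jk_o + 5\kappa_m - \frac{\kappa_g}{\kappa_m}, \qquad \Delta\kappa = \kappa_1 - \kappa_2, \qquad \bar{\bar{C}} = \hat{t}_1 \hat{t}_1 - \hat{t}_2 \hat{t}_2 \qquad (3)$$

Also, $\hat{t}_1, \hat{t}_2$ are the orthogonal tangential unit vectors in the Dupin coordinate
system. The finite element implementation becomes simpler, and in some
cases symmetric, if the term $\nabla_t E_n$ can be replaced by a double derivative.
Fortunately, on considering the series expansion of the term $\hat{n} \times \nabla \times \nabla_t E_n$,
and simplifying, we have

$$\nabla_t E_n \approx \frac{1}{jk_o} \nabla_t \left( \nabla \cdot \mathbf{E}_t \right) \qquad (4)$$

Thus we can rewrite the second order conformal ABC as

$$\left( D - 2\kappa_m \right) \hat{n} \times \nabla \times \mathbf{E} = \left\{ 4\kappa_m^2 - \kappa_g + D \left( jk_o - \bar{\bar{\eta}} \right) + \left( \bar{\bar{\eta}} \right)^2 + \kappa_m \Delta\kappa\, \bar{\bar{C}} \right\} \cdot \mathbf{E}_t + \nabla \times \left\{ \hat{n} \left( \nabla \times \mathbf{E} \right)_n \right\} + \frac{1}{jk_o} \left( jk_o + 3\kappa_m - \frac{\kappa_g}{\kappa_m} - 2\bar{\bar{\eta}}\, \cdot \right) \nabla_t \left( \nabla \cdot \mathbf{E}_t \right) \qquad (5)$$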

To make (1) and (5) implementable in finite element systems, the results
are simplified by taking the dot product of the expression with $\mathbf{E}$, using the
divergence condition, the vector wave equation and some vector identities
[5]. The first order conformal ABC in readily implementable form is given
by

$$\int_{S_o} \mathbf{E} \cdot P_1(\mathbf{E})\, dS = \int_{S_o} \left[ jk_o\, \mathbf{E}_t \cdot \mathbf{E}_t + \frac{\kappa_1 - \kappa_2}{2} \left( E_2^2 - E_1^2 \right) \right] dS \qquad (6)$$

where $S_o$ is the mesh truncation surface and $P_1(\mathbf{E}) = \hat{n} \times \nabla \times \mathbf{E}$ with the
subscript denoting the order of the ABC.

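The second order ABC reduces to

$$\int_{S_o} \mathbf{E} \cdot P_2(\mathbf{E})\, dS = \int_{S_o} \left( \alpha_1 E_1^2 + \alpha_2 E_2^2 \right) dS + \int_{S_o} \beta \left( \nabla \times \mathbf{E} \right)_n^2 dS - \int_{S_o} \left( \nabla \cdot \mathbf{E}_t \right) \left\{ \nabla \cdot \left( \bar{\bar{\gamma}} \cdot \mathbf{E} \right)_t \right\} dS \qquad (7)$$

where the tensors $\bar{\bar{\alpha}}$, $\bar{\bar{\gamma}}$ and the scalar $\beta$ are given by

$$\bar{\bar{\alpha}} = \frac{1}{D - 2\kappa_m} \left[ \left\{ 4\kappa_m^2 - \kappa_g + D \left( jk_o - \kappa_1 \right) + \kappa_1^2 + \kappa_m \Delta\kappa \right\} \hat{t}_1 \hat{t}_1 + \left\{ 4\kappa_m^2 - \kappa_g + D \left( jk_o - \kappa_2 \right) + \kappa_2^2 - \kappa_m \Delta\kappa \right\} \hat{t}_2 \hat{t}_2 \right]$$

$$\beta = \frac{1}{D - 2\kappa_m}$$

$$\bar{\bar{\gamma}} = \frac{1}{jk_o \left( D - 2\kappa_m \right)} \left[ \left( jk_o + 3\kappa_m - \frac{\kappa_g}{\kappa_m} - 2\kappa_1 \right) \hat{t}_1 \hat{t}_1 + \left( jk_o + 3\kappa_m - \frac{\kappa_g}{\kappa_m} - 2\kappa_2 \right) \hat{t}_2 \hat{t}_2 \right] \qquad (8)$$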


It  should  be  remarked  that  the  normal  component  of  each  surface  edge  must 
be  made  continuous  across  inter-element  boundaries  (triangular  patches  in 
our  case)  for  the  contour  integral  associated  with  the  third  term  in  (7)  to 
vanish.  Moreover,  it  can  be  shown  that  the  first  order  ABC  (6)  is  always 
symmetric whereas the second order ABC (7) is symmetric only when $\kappa_1 = \kappa_2$
on  the  boundary  surface  or  when  the  surface  is  cylindrical  and  linear  edge 
bases  are  employed.  For  a  detailed  analysis  of  symmetry  considerations,  the 
reader  is  referred  to  [5]. 

2.2  Results 

In this section, we present validations for the conformal ABCs derived in the
previous  section.  A  complete  description  of  the  numerous  geometries  that 
were  validated  using  these  ABCs  can  be  found  in  [5]. 


[Figure 1 panels: backscatter RCS, $\sigma/\lambda^2$ in dB, versus observation angle $\theta_o$ in degrees.]

Figure 1: Backscatter pattern of a perfectly conducting conesphere (cone
height = $4\lambda$; radius = $0.5\lambda$) for $\phi\phi$ and $\theta\theta$ polarizations. Black dots indicate
computed values and the solid line represents data from a body of revolution
code [11]. Mesh termination surface is a rectangular box.

The  geometry  for  which  the  RCS  results  were  computed  is  unique  in  its 
own  way.  A  conesphere  is  basically  a  hemisphere  attached  to  a  cone.  It  is  a 
difficult  geometry  to  mesh  since  a  surface  singularity  exists  at  the  tip  of  the 
cone.  The  singularity  can  be  removed  in  two  ways:  i)  by  creating  a  small 
region  near  the  tip  and  detaching  it  from  the  surface  or  ii)  by  chopping  off 
a  small  part  near  the  tip  of  the  cone.  The  second  option  inevitably  leads 
to  small  inaccuracies  for  backscatter  from  the  conical  tip;  however,  we  chose 
this  option  since  the  conical  angle  in  our  tested  geometry  was  extremely 
small (around 7°) and the mesh generator failed to mesh the first case on
numerous occasions. In Figure 1, we plot the backscatter patterns of a $4.5\lambda$
long conesphere having a radius of $0.5\lambda$ for $\theta\theta$ and $\phi\phi$ polarizations. The mesh
truncation surface is a rectangular box placed $0.4\lambda$ from the surface of the
conesphere. The far-field results compare extremely well with computations
from a body of revolution code [11].


3 Vectorization/parallelization strategies

Since our focus is on solving large problems, the code must be optimized to
run fast on vector and parallel architectures. In [6], we detailed our
parallelization strategy and presented results on speedup and inter-processor
communication. However, the performance of the code on vector processors
was not very encouraging. In the subsequent sections, we outline our
optimization scheme for vector computers and present techniques for reducing
inter-processor communication on distributed memory architectures.


3.1  Vector  optimization 

Since a sparse matrix has a very small number of non-zeros per row by
definition and only the innermost loops are vectorizable, it is difficult to
get good vector performance from such codes. Further, indirect addressing
is an inherent part of sparse data structures, a feature which allows us to
exploit the O(N) storage characteristic but reduces speed on vector machines.
Therefore, there are two main problems which limit the vectorizability of a
sparse matrix code: short vector lengths and indirect addressing. The latter
problem cannot be corrected but the first bottleneck can be removed. This
is done by storing the matrix in a different format such that the vector
lengths are approximately equal to the order of the system being solved.
In the traditional storage system, the Compressed Sparse Row (CSR) format,
the non-zeros of the matrix and their corresponding column numbers are
stored in a long complex and integer vector, respectively, with another short
integer vector to store the number of non-zeros per row. However, this does
not permit efficient vectorization since the average vector length is very small (16
in our case). The ITPACK format [12] alleviates the short vector length
problem by storing the entire matrix in a rectangular block. In this block,
the number of rows equals the row count of the original matrix and the
number of columns equals the maximum number of non-zeros in a row of
the matrix; rows containing fewer non-zero elements are padded with zeros.
This scheme works very well for matrices where the number of non-zeros per
row is approximately the same for all rows. In our case, this storage technique is not
very beneficial since approximately 30% of the space is lost in zero padding.
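
The trade-off between the two formats can be made concrete with a small sketch. It is written in plain Python for readability rather than speed, uses real instead of complex entries, and the function and variable names as well as the 4 x 4 test matrix are assumptions for illustration; it is not the authors' code.

```python
def csr_to_itpack(vals, cols, row_counts):
    """Pad every row to the widest row, giving rectangular value/column blocks."""
    width = max(row_counts)
    coef, jcoef, start = [], [], 0
    for r, n in enumerate(row_counts):
        pad = width - n
        coef.append(vals[start:start + n] + [0.0] * pad)
        # Padded slots carry a zero value, so any valid column index works here.
        jcoef.append(cols[start:start + n] + [r] * pad)
        start += n
    return coef, jcoef, width


def itpack_matvec(coef, jcoef, width, x):
    """y = A x with the long loop (over all rows) innermost."""
    y = [0.0] * len(coef)
    for k in range(width):            # short outer loop over block columns
        for r in range(len(coef)):    # inner loop length equals the system order
            y[r] += coef[r][k] * x[jcoef[r][k]]
    return y


def padding_fraction(row_counts):
    """Fraction of the rectangular block occupied by padding zeros."""
    return 1.0 - sum(row_counts) / (max(row_counts) * len(row_counts))


# Hypothetical 4 x 4 matrix in the CSR layout described above.
vals = [4.0, 1.0, 2.0, 3.0, 1.0, 5.0, 6.0]
cols = [0, 1, 3, 1, 0, 2, 3]
row_counts = [3, 1, 2, 1]
x = [1.0, 2.0, 3.0, 4.0]
coef, jcoef, width = csr_to_itpack(vals, cols, row_counts)
print(itpack_matvec(coef, jcoef, width, x), padding_fraction(row_counts))
```

The inner loop now runs over every row, which is what a vector processor needs, but the padding fraction shows the storage cost when row lengths vary.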

The storage format that works best for our type of matrix is called the
jagged diagonal format [13]. The rows are ordered in decreasing order of the
number of non-zeros per row. The rows containing the maximum number of
non-zero entries are thus placed at the top of the matrix and the rows with
the minimum non-zero entries are shuffled to the bottom. In the actual storage
scheme, the leftmost elements of each row are stored as a dense vector
with an additional vector indicating the column numbers of each element;
the second elements of each row form the next vector, and so on.
The matrix is thus stored as a collection of vectors of decreasing length. The
inner loop of the matrix-vector multiplication routine traverses the entire
length of a jagged diagonal, which can be of the order of the system being
solved. This feature greatly enhances vectorization. The storage requirement
of the above format can be made to be the same as the previously mentioned
CSR format through careful programming. The altered matrix-vector
multiplication routine then runs at around 275 Mflops on a Cray C-90 whereas the
older code with CSR storage peaked at 60 Mflops. The dot product reaches
speeds of 550 Mflops and the vector updates execute at 600 Mflops. It must
be mentioned that the Cray C-90 is a substantially faster machine than the
Cray Y-MP but the CSR formatted matrix-vector multiplication routine runs
about 4 times slower on the C-90. Therefore, we can reliably state that the
method of jagged diagonals is the best sparse matrix storage scheme in terms
of computer storage and vectorizability. The still slower execution speed of
the matrix-vector multiply compared with the vector update is due to the
indirect addressing in the inner loop, which causes memory contention.
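
The jagged diagonal layout and its matrix-vector product can be sketched as follows. This is a plain Python illustration with real entries and a hypothetical 4 x 4 matrix; the names `perm`, `jd_vals`, `jd_cols` and `jd_ptr` are chosen for the example and are not taken from the authors' code or from [13].

```python
def csr_to_jad(vals, cols, row_counts):
    """Build jagged diagonals from CSR data (values, columns, non-zeros per row)."""
    n = len(row_counts)
    starts = [0] * n
    for r in range(1, n):
        starts[r] = starts[r - 1] + row_counts[r - 1]
    # Rows in decreasing order of non-zero count (stable for equal counts).
    perm = sorted(range(n), key=lambda r: -row_counts[r])
    jd_vals, jd_cols, jd_ptr = [], [], [0]
    for k in range(max(row_counts)):
        for r in perm:
            if row_counts[r] <= k:
                break  # rows are sorted, so every later row is too short as well
            jd_vals.append(vals[starts[r] + k])
            jd_cols.append(cols[starts[r] + k])
        jd_ptr.append(len(jd_vals))
    return perm, jd_vals, jd_cols, jd_ptr


def jad_matvec(perm, jd_vals, jd_cols, jd_ptr, x):
    """y = A x; the inner loop sweeps one whole jagged diagonal."""
    yp = [0.0] * len(perm)  # result in permuted row order
    for k in range(len(jd_ptr) - 1):
        base = jd_ptr[k]
        for i in range(jd_ptr[k + 1] - base):
            yp[i] += jd_vals[base + i] * x[jd_cols[base + i]]
    y = [0.0] * len(perm)
    for i, r in enumerate(perm):
        y[r] = yp[i]  # undo the row permutation
    return y


# Hypothetical 4 x 4 matrix in CSR form.
vals = [4.0, 1.0, 2.0, 3.0, 1.0, 5.0, 6.0]
cols = [0, 1, 3, 1, 0, 2, 3]
row_counts = [3, 1, 2, 1]
x = [1.0, 2.0, 3.0, 4.0]
perm, jd_vals, jd_cols, jd_ptr = csr_to_jad(vals, cols, row_counts)
print(jad_matvec(perm, jd_vals, jd_cols, jd_ptr, x))
```

Only the actual non-zeros are stored, so there is no padding, while the first jagged diagonal has one entry per non-empty row and therefore supplies the long vector length. The indirect access `x[jd_cols[...]]` remains, which is consistent with the matrix-vector product staying slower than the vector update.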

3.2  Reduction  of  processor  communication 


(a)  (b) 


Figure  2:  (a)  Original  sparse  matrix  (b)  Re-ordered  sparse  matrix  using  a 
profile  reduction  algorithm. 

In the previous section, we discussed optimization from the viewpoint of a
vector processor. In this section, we propose a scheme for reducing
inter-processor communication on multi-processor architectures. In [6], we
outlined our success in parallelizing the computationally intensive portions of a
finite element code on distributed memory architectures. It was also pointed
out that further speedups could be achieved only through reducing data
communication among the various processors.

The majority of processor time in a finite element code is spent in the
equation solver. In our code, we employ an iterative solver since it preserves
the sparsity of the finite element matrix, making minimal demands on
computer storage. In the biconjugate gradient algorithm, there are principally
two  operations  in  which  the  most  intensive  communication  takes  place.  The 
first  is  the  sparse  matrix-vector  multiplication  and  the  second  is  the  search 
vector  update  at  the  end  of  each  iteration. 

In the matrix-vector multiply, each processor computes a block of the
result vector by multiplying the corresponding block of rows of the sparse
matrix with the operand vector. Since the operand vector is distributed
among the processors, data communication is required. Each processor does
two things: (i) it sends out a request for those matrix entries it does not
own but needs for performing the multiplication and (ii) sends out those
matrix entries it owns on request from other processors. The
communication pattern is determined by the sparsity structure of the matrix, which
in our case is derived from an unstructured mesh. Therefore the
communication pattern is unstructured and irregular. However, on reordering the
matrix using a standard profile reduction algorithm (part b of Figure 2), the
matrix becomes banded and the only communication should occur between
adjacent processors. In fact, by storing a few extra matrix entries in each
processor, inter-processor communication can be removed altogether in the
matrix-vector multiplication phase. However, the time taken for
communication due to the vector update still remains the same.
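
The effect of reordering on the communication pattern can be sketched as follows. This is an illustrative model only, assuming a simple contiguous block-row partition; the function name `comm_partners` and the dense boolean sparsity pattern are hypothetical conveniences, not part of the code described in this paper. For each processor it lists the other processors that own operand entries touched by its block of rows.

```python
import numpy as np

def comm_partners(pattern, n_procs):
    """Map each processor to the set of other processors it must exchange data with.

    pattern : square boolean array, True where the matrix has a nonzero.
    Rows (and operand entries) are split into contiguous blocks, one per processor.
    """
    n = pattern.shape[0]
    block = -(-n // n_procs)  # ceiling division: rows per processor
    partners = {}
    for p in range(n_procs):
        rows = slice(p * block, min((p + 1) * block, n))
        # Columns referenced by this processor's rows.
        cols = np.flatnonzero(pattern[rows].any(axis=0))
        owners = set((cols // block).tolist())
        partners[p] = owners - {p}
    return partners
```

For a banded pattern whose half-bandwidth does not exceed the block size, every set returned contains only the neighbouring processors p-1 and p+1, which is the situation the profile reduction is meant to produce; a single far off-diagonal entry adds a non-adjacent partner.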


4  Conclusion 

In this paper, we have discussed the optimization strategies that were
employed to improve our finite element code from the algorithmic and the
numerical point of view. We have achieved notable success on both fronts.

The problem size was reduced by a significant amount owing to the use of
conformal ABCs and the savings are only going to increase as the problem size
gets larger. Higher order ABCs may enable us to bring the mesh termination
surface even closer to the target, enabling us to do larger problems with
the available computer resources. The numerical aspect is also important
for addressing the utility and the feasibility issues for solving practical 3D
problems in a reasonable amount of time. As processor speeds increase and
parallel architectures mature, the speed and performance of the code will
vastly improve. Therefore, issues concerning the performance of the finite
element-ABC technique and its implementability in large scale computer
simulations will continue to be important in the years to come.

Abstract

This paper discusses the fundamental formulations for solving the three dimensional
Maxwell’s equations using the characteristic-based finite volume time domain method. The
characteristic-based “non-reflection” boundary condition at the outer truncation boundary
and the boundary condition on the surface of a scatterer are also presented. Numerical results
calculated using this method for radiation from a dipole and scattering from a perfectly
conducting sphere are compared to the theoretical solutions.


1  Introduction 

Methods based on solving Maxwell’s equations in the partial differential equation (PDE) form
have emerged as viable techniques in recent years. The increased popularity of the PDE methods
is due to their inherent capability in treating penetrable materials compared to other techniques
based on the integral equation (IE) form and the asymptotic approximation. The PDE methods,
especially those employing explicit schemes, also have an advantage over IE methods in terms of
computer memory requirement. Of the PDE methods known to EM engineers and researchers,
the finite difference time domain (FDTD) method developed by Yee in 1966 is the most widely
used method due to its simplicity. However, Yee’s FDTD suffers from staircasing error if the boundary
of the physical geometry deviates from the uniform computational grid. The other widely used
PDE method is based on the finite element method (FEM), in which the geometry is discretized
into elements that are conformal to the physical geometry. Unfortunately, the finite element method
requires much more computer memory for storing the unstructured grids. In addition, the
accuracy of FEM solutions usually depends on the quality of the grids. This paper introduces
EM researchers to a structured grid characteristic-based finite volume time domain method [1,2,3]
to solve the 3D time domain Maxwell equations. This characteristic-based technique has been
widely used in the computational fluid dynamics (CFD) community to solve hyperbolic partial
differential equation systems. Because Maxwell’s equations in time domain PDE form
constitute a hyperbolic PDE system, most of the algorithms developed for CFD applications
can be readily applied to computational electromagnetics (CEM) problems. This technique
appears to offer certain advantages over the FDTD and FEM methods because it can solve the
problem conformally like the FEM and yet efficiently like the FDTD due to the structured grid
used. The characteristic “non-reflection” boundary condition has accuracy similar to that of
absorbing boundary conditions such as Mur’s, but is computationally more efficient to implement.
This paper will discuss the fundamental formulations of the characteristic-based finite volume
time domain method, the “non-reflection” boundary condition at the outer truncation boundary,
the boundary condition on the surface of the scatterer, and the field update and time-stepping
procedure. Some numerical results will be presented to demonstrate the capability of this method.


2  Formulations 


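The time domain Maxwell’s equations are given by

$$\frac{\partial \mathbf{B}}{\partial t} + \nabla \times \mathbf{E} = 0 ; \qquad \nabla \cdot \mathbf{B} = 0 \tag{1}$$

$$\frac{\partial \mathbf{D}}{\partial t} - \nabla \times \mathbf{H} = -\mathbf{J} ; \qquad \nabla \cdot \mathbf{D} = \rho_v \tag{2}$$

where E is the electric intensity; H is the magnetic intensity; D is the electric flux density; B is
the magnetic flux density; J is the electric current density; and $\rho_v$ is the electric charge density.
J and $\rho_v$ are related through the continuity equation given by

$$\nabla \cdot \mathbf{J} = -\frac{\partial \rho_v}{\partial t} \tag{3}$$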


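In simple linear matter, B and D are related to E and H by the constitutive relationships:

$$\mathbf{D} = \epsilon \mathbf{E} ; \qquad \mathbf{B} = \mu \mathbf{H} \tag{4}$$

where $\epsilon$ and $\mu$ are the permittivity and permeability of the media, respectively. Maxwell’s
equations can be rewritten in flux-vector form given by

$$\frac{\partial U}{\partial t} + \frac{\partial F}{\partial x} + \frac{\partial G}{\partial y} + \frac{\partial H}{\partial z} = S \tag{5}$$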
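where

$$U = [\, B_x \;\; B_y \;\; B_z \;\; D_x \;\; D_y \;\; D_z \,]^T ; \qquad S = [\, 0 \;\; 0 \;\; 0 \;\; -J_x \;\; -J_y \;\; -J_z \,]^T \tag{6}$$

$$F = [\, 0 \;\; -E_z \;\; E_y \;\; 0 \;\; H_z \;\; -H_y \,]^T ; \qquad G = [\, E_z \;\; 0 \;\; -E_x \;\; -H_z \;\; 0 \;\; H_x \,]^T \tag{7}$$

$$H = [\, -E_y \;\; E_x \;\; 0 \;\; H_y \;\; -H_x \;\; 0 \,]^T \tag{8}$$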

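T stands for the transpose of a matrix. From the constitutive relationships, F, G, and H are
related to U by

$$F = AU, \quad G = BU, \quad \text{and} \quad H = CU \tag{9}$$

where

$$A = \begin{bmatrix} 0&0&0&0&0&0 \\ 0&0&0&0&0&-\frac{1}{\epsilon} \\ 0&0&0&0&\frac{1}{\epsilon}&0 \\ 0&0&0&0&0&0 \\ 0&0&\frac{1}{\mu}&0&0&0 \\ 0&-\frac{1}{\mu}&0&0&0&0 \end{bmatrix} ; \quad
B = \begin{bmatrix} 0&0&0&0&0&\frac{1}{\epsilon} \\ 0&0&0&0&0&0 \\ 0&0&0&-\frac{1}{\epsilon}&0&0 \\ 0&0&-\frac{1}{\mu}&0&0&0 \\ 0&0&0&0&0&0 \\ \frac{1}{\mu}&0&0&0&0&0 \end{bmatrix} ; \quad
C = \begin{bmatrix} 0&0&0&0&-\frac{1}{\epsilon}&0 \\ 0&0&0&\frac{1}{\epsilon}&0&0 \\ 0&0&0&0&0&0 \\ 0&\frac{1}{\mu}&0&0&0&0 \\ -\frac{1}{\mu}&0&0&0&0&0 \\ 0&0&0&0&0&0 \end{bmatrix} \tag{10}$$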

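By solving for the eigenvalues and eigenvectors of A, A can be expressed as

$$A = S_x \Lambda S_x^{-1} \tag{11}$$

where $\Lambda$ is the diagonal matrix given by

$$\Lambda = \begin{bmatrix} \lambda_1&0&0&0&0&0 \\ 0&\lambda_2&0&0&0&0 \\ 0&0&\lambda_3&0&0&0 \\ 0&0&0&\lambda_4&0&0 \\ 0&0&0&0&\lambda_5&0 \\ 0&0&0&0&0&\lambda_6 \end{bmatrix} \tag{12}$$

$\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$, $\lambda_5$, and $\lambda_6$ are the eigenvalues of A given by

$$\lambda = \left\{ -\frac{1}{\sqrt{\epsilon\mu}},\; -\frac{1}{\sqrt{\epsilon\mu}},\; \frac{1}{\sqrt{\epsilon\mu}},\; \frac{1}{\sqrt{\epsilon\mu}},\; 0,\; 0 \right\} \tag{13}$$

$S_x$ is the diagonalizing matrix given by

$$S_x = \begin{bmatrix} 0&0&0&0&0&1 \\ \sqrt{\mu/\epsilon}&0&-\sqrt{\mu/\epsilon}&0&0&0 \\ 0&-\sqrt{\mu/\epsilon}&0&\sqrt{\mu/\epsilon}&0&0 \\ 0&0&0&0&1&0 \\ 0&1&0&1&0&0 \\ 1&0&1&0&0&0 \end{bmatrix} \tag{14}$$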
and $S_x^{-1}$ is the inverse of $S_x$. Similarly, B and C can be expressed as $B = S_y \Lambda S_y^{-1}$ and
$C = S_z \Lambda S_z^{-1}$ with corresponding diagonalizing matrices $S_y$ and $S_z$.

In order to achieve the approximate Riemann solver and apply the non-reflection boundary
condition at the truncation boundary, F needs to be split into $F^+$ and $F^-$ [4] with $F^+$ and $F^-$
corresponding to the outgoing and incoming waves, respectively. To accomplish this, $\Lambda$ will be
split into $\Lambda^+$ and $\Lambda^-$ with $\Lambda^+$ containing only the positive eigenvalues ($1/\sqrt{\epsilon\mu}$), and
$\Lambda^-$ containing only the negative eigenvalues ($-1/\sqrt{\epsilon\mu}$). Thus, one has

$$F = F^+ + F^- = S_x \Lambda^+ S_x^{-1} U + S_x \Lambda^- S_x^{-1} U \tag{15}$$

Depending on the accuracy desired, equation (22) can be discretized in many different ways.
Let $t = n\Delta t$ denote the nth time level and $(i, j, k)$ denote the $(x, y, z)$ coordinate of a point at
$x = i\Delta x$, $y = j\Delta y$, and $z = k\Delta z$. U is defined at the centroid of the finite volume cell with
index $(i, j, k)$ while F, G, and H are defined on the surfaces of the cell with indices $(i+\frac{1}{2}, j, k)$,
$(i, j+\frac{1}{2}, k)$, and $(i, j, k+\frac{1}{2})$, respectively. The field update procedure starts with calculating U at
the cell surfaces by interpolating the values of U at the appropriate cell centroids. The following
formulations give the values of $U^L_{i+\frac{1}{2}}$ and $U^R_{i+\frac{1}{2}}$ using the upwind biased algorithm [5]:

$$U^L_{i+\frac{1}{2}} = U_i + \frac{\phi}{4}\left[(1-\kappa)\nabla + (1+\kappa)\Delta\right] U_i \tag{23}$$

$$U^R_{i+\frac{1}{2}} = U_{i+1} - \frac{\phi}{4}\left[(1+\kappa)\nabla + (1-\kappa)\Delta\right] U_{i+1} \tag{24}$$

where $\nabla U_i = U_i - U_{i-1}$ and $\Delta U_i = U_{i+1} - U_i$; $\phi$ equals zero for a first order accurate scheme and
one for higher order accurate schemes. $\kappa$ is set to a value that gives the desired order of accuracy.
For instance, when $\kappa = 0$, the above equations yield a second order accurate scheme, which is used
to give the results reported in this paper. From the values of $U^L$ and $U^R$, the fluxes can be
reconstructed by

$$F_{i+\frac{1}{2}} = F^+(U^L_{i+\frac{1}{2}}) + F^-(U^R_{i+\frac{1}{2}}) ; \quad G_{j+\frac{1}{2}} = G^+(U^L_{j+\frac{1}{2}}) + G^-(U^R_{j+\frac{1}{2}}) ; \quad H_{k+\frac{1}{2}} = H^+(U^L_{k+\frac{1}{2}}) + H^-(U^R_{k+\frac{1}{2}}) \tag{25}$$


Once F, G, and H are obtained on the cell surfaces, they are used to calculate $\Delta U$ in equation (22).
To update U in time, a two-stage Runge-Kutta second order accurate scheme is used to produce
the results reported here:

$$\begin{aligned} U_0 &= U_n \\ U_1 &= U_0 - \Delta U(U_0) \\ U_2 &= U_0 - 0.5\left(\Delta U(U_1) + \Delta U(U_0)\right) \\ U_{n+1} &= U_2 \end{aligned} \tag{26}$$

The source function is applied as the initial boundary value at the location where the source
exists. The non-reflection boundary condition is applied by setting $F^-$, $G^-$, and $H^-$ equal to zero
at the outer truncation boundary to suppress the wave coming into the computational domain.
The non-reflection boundary condition is exact for one dimensional problems since the direction
of the wave propagation is always aligned with the coordinate. However, it becomes approximate
for two and three dimensional problems when the wave propagation direction is no longer aligned
with the coordinates.
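
The flux splitting, the upwind biased interpolation of equations (23) and (24), and the two-stage Runge-Kutta update can be illustrated with a one dimensional sketch. The code below is not the authors' implementation; it assumes a reduced state vector $U = [B_z, D_y]$ with $F = [D_y/\epsilon, B_z/\mu]$ (the corresponding rows of equation (7)), a uniform grid, $\epsilon = \mu = 1$, and periodic boundaries in place of the non-reflection condition. All function names are hypothetical.

```python
import numpy as np

def split_flux_matrices(eps=1.0, mu=1.0):
    """Return A, A+ and A- for the 1-D system U = [B_z, D_y], F = A U."""
    A = np.array([[0.0, 1.0 / eps], [1.0 / mu, 0.0]])
    lam, S = np.linalg.eig(A)  # A = S diag(lam) S^-1
    S_inv = np.linalg.inv(S)
    A_plus = S @ np.diag(np.maximum(lam, 0.0)) @ S_inv   # positive eigenvalues only
    A_minus = S @ np.diag(np.minimum(lam, 0.0)) @ S_inv  # negative eigenvalues only
    return A, A_plus, A_minus

def delta_u(U, A_plus, A_minus, dt_over_dx, kappa=0.0, phi=1.0):
    """Flux balance dt/dx * (F_{i+1/2} - F_{i-1/2}) on a periodic grid; U has shape (N, 2)."""
    back = U - np.roll(U, 1, axis=0)    # backward difference U_i - U_{i-1}
    fwd = np.roll(U, -1, axis=0) - U    # forward difference U_{i+1} - U_i
    U_left = U + 0.25 * phi * ((1 - kappa) * back + (1 + kappa) * fwd)
    U_right_own = U - 0.25 * phi * ((1 + kappa) * back + (1 - kappa) * fwd)
    U_right = np.roll(U_right_own, -1, axis=0)  # right state at i+1/2 comes from cell i+1
    flux = U_left @ A_plus.T + U_right @ A_minus.T
    return dt_over_dx * (flux - np.roll(flux, 1, axis=0))

def rk2_step(U, A_plus, A_minus, dt_over_dx):
    """Two-stage Runge-Kutta update in the form of equation (26)."""
    U0 = U
    U1 = U0 - delta_u(U0, A_plus, A_minus, dt_over_dx)
    return U0 - 0.5 * (delta_u(U1, A_plus, A_minus, dt_over_dx)
                       + delta_u(U0, A_plus, A_minus, dt_over_dx))

def propagate_pulse(n_cells=200, cfl=0.5):
    """Advance a right-going Gaussian pulse once around a unit periodic domain."""
    _, A_plus, A_minus = split_flux_matrices()
    x = (np.arange(n_cells) + 0.5) / n_cells
    g = np.exp(-0.5 * ((x - 0.5) / 0.1) ** 2)
    U_start = np.stack([g, g], axis=1)  # B_z = D_y gives a purely right-going wave
    U = U_start.copy()
    for _ in range(int(round(n_cells / cfl))):  # wave speed is 1, so total time is 1
        U = rk2_step(U, A_plus, A_minus, cfl)
    return np.max(np.abs(U - U_start))
```

With $\kappa = 0$ and a Courant number of 0.5 the pulse returns to its starting position with only a small phase and amplitude error, which is the behaviour expected of a second order upwind biased scheme.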

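Equation (5) is useful when the scatterer in the computational domain can be discretized
into uniform grids. But in most practical applications, a curvilinear grid which conforms to the
surface of the scatterer is required. Equation (5) can be mapped to a curvilinear coordinate system
defined by

$$\xi = \xi(x, y, z) ; \quad \eta = \eta(x, y, z) ; \quad \zeta = \zeta(x, y, z) \tag{27}$$

and rewritten as

$$\frac{\partial \tilde{U}}{\partial t} + \frac{\partial \tilde{F}}{\partial \xi} + \frac{\partial \tilde{G}}{\partial \eta} + \frac{\partial \tilde{H}}{\partial \zeta} = \tilde{S} \tag{28}$$

where

$$\tilde{U} = \frac{1}{|J|} U ; \quad \tilde{F} = \frac{1}{|J|}\left(\xi_x F + \xi_y G + \xi_z H\right) ; \quad \tilde{G} = \frac{1}{|J|}\left(\eta_x F + \eta_y G + \eta_z H\right) ; \quad \tilde{H} = \frac{1}{|J|}\left(\zeta_x F + \zeta_y G + \zeta_z H\right) ; \quad \tilde{S} = \frac{1}{|J|} S \tag{29}$$

J is the Jacobian of the coordinate transform given by

$$J = \begin{bmatrix} \xi_x & \eta_x & \zeta_x \\ \xi_y & \eta_y & \zeta_y \\ \xi_z & \eta_z & \zeta_z \end{bmatrix} \tag{30}$$

and |J| is its determinant.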

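Again, the continuous derivatives in Equation (28) can be approximated by discrete operators
given by

$$\frac{\Delta \tilde{U}}{\Delta t} + \frac{\Delta \tilde{F}}{\Delta \xi} + \frac{\Delta \tilde{G}}{\Delta \eta} + \frac{\Delta \tilde{H}}{\Delta \zeta} = \tilde{S} \tag{31}$$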


However, if one attempts to find the split fluxes in the $(\xi, \eta, \zeta)$ coordinate system, one would need
to rederive the eigenvalues and eigenvectors in the $(\xi, \eta, \zeta)$ coordinate system. Recognizing that
all outward normal vectors on the surfaces of each structured grid cell are defined by $\nabla\xi$, $\nabla\eta$, and
$\nabla\zeta$, locally orthogonal coordinate systems can be generated on the constant $\xi$, $\eta$, and $\zeta$ surfaces,
respectively. For instance, the locally orthogonal coordinate system for constant $\xi$ can be found
by defining its first unit vector as

$$u_{\xi 1} = \frac{\nabla \xi}{|\nabla \xi|} \tag{32}$$

The second unit vector can be found by defining an arbitrary vector $R_\xi$ on the constant $\xi$ surface
and taking the cross product of $R_\xi$ and $u_{\xi 1}$ to give

$$u_{\xi 2} = \frac{R_\xi \times u_{\xi 1}}{|R_\xi \times u_{\xi 1}|} \tag{33}$$

The third unit vector is generated by

$$u_{\xi 3} = u_{\xi 1} \times u_{\xi 2} \tag{34}$$

Similarly, the locally orthogonal coordinate systems on the constant
$\eta$ and $\zeta$ planes can be found by replacing the subscript $\xi$ in equations (32) to (34) with $\eta$ and $\zeta$,
respectively. Because $(u_{\xi 1}, u_{\xi 2}, u_{\xi 3})$ are orthogonal, the inverse of the transformation matrix
formed from them equals its transpose.

Transforming $\tilde{F}$ to the $(u_{\xi 1}, u_{\xi 2}, u_{\xi 3})$ coordinate system, one has


Notice that $\bar{F}$ has the same form as the F in the Cartesian coordinate system. Thus, the eigenvalues
and diagonalizing matrices for $\bar{F}$ are the same as those of F. In other words, $\bar{F}$ can be split into
$\bar{F}^+$ and $\bar{F}^-$ in the same way as F in the Cartesian coordinate system. Similarly, $\tilde{G}$ and $\tilde{H}$ can be
transformed to the $(u_{\eta 1}, u_{\eta 2}, u_{\eta 3})$ and $(u_{\zeta 1}, u_{\zeta 2}, u_{\zeta 3})$ coordinate systems to split the fluxes. Thus, the
field update procedure for the curvilinear coordinate system is done by first transforming U to the
$(u_{\xi 1}, u_{\xi 2}, u_{\xi 3})$ coordinate system to find $\bar{F}^+$ and $\bar{F}^-$, then transforming $\bar{F}$ back to $\tilde{F}$ to calculate
$\Delta\tilde{F}$. A similar procedure is repeated for the calculation of $\Delta\tilde{G}$ and $\Delta\tilde{H}$. Finally, $\Delta\tilde{U}$ is found from
equation (31). The time integration procedure and the interpolation procedure for finding $U^L$
and $U^R$ on the cell surfaces are the same as in equations (23) to (26).
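
The construction of a locally orthogonal coordinate system from a cell-face normal can be sketched numerically. This is a minimal illustration under stated assumptions: `local_basis` is a hypothetical helper, `grad_xi` stands for the metric gradient on a constant-coordinate face, and `tangent` is any vector not parallel to it. The rows of the returned matrix are the three unit vectors, so the matrix rotates Cartesian components into the local frame and its inverse is its transpose.

```python
import numpy as np

def local_basis(grad_xi, tangent):
    """Build an orthonormal, right-handed triad from a face normal and an arbitrary vector."""
    grad_xi = np.asarray(grad_xi, dtype=float)
    tangent = np.asarray(tangent, dtype=float)
    u1 = grad_xi / np.linalg.norm(grad_xi)   # unit normal to the face
    cross = np.cross(tangent, u1)            # perpendicular to u1 by construction
    u2 = cross / np.linalg.norm(cross)
    u3 = np.cross(u1, u2)                    # completes the right-handed triad
    return np.vstack([u1, u2, u3])
```

Because the same triad is built on every face, the Cartesian eigenvalues and diagonalizing matrices can be reused after rotating the field components with this matrix, and the result is rotated back with the transpose.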


3  Boundary  Conditions  on  the  Surface  of  the  Scatterer 

Maxwell’s equations in integral form yield the following boundary conditions:

$$\hat{n} \times (E_1 - E_2) = 0 ; \qquad \hat{n} \cdot (B_1 - B_2) = 0$$
$$\hat{n} \times (H_1 - H_2) = J_s ; \qquad \hat{n} \cdot (D_1 - D_2) = \rho_s \tag{36}$$

Since there are a total of six field components for the electric and magnetic fields, equation (36)
only provides three deterministic equations in relating the field components across the surface
of the scatterer; namely, the two tangential components of the electric field and the normal
component of the magnetic field are continuous across the surface of the scatterer. For a
non-perfectly conducting scatterer, the surface current and charge densities, $J_s$ and $\rho_s$, are finite and
confined to the surface. Thus, one has

$$\hat{n} \cdot \nabla\left(\hat{n} \times (H_1 - H_2)\right) = 0 ; \qquad \hat{n} \cdot \nabla\left(\hat{n} \cdot (D_1 - D_2)\right) = 0 \tag{37}$$

which provides derivative conditions needed to completely describe the behavior of the electric
and magnetic fields in the normal direction. In the case of a perfectly conducting scatterer, the
surface current and charge densities approach infinity. However, they can be defined in a limiting
sense such that equation (37) still applies.


4  Results

Two simple cases using the second order finite volume scheme discussed in this paper will be
given. The first case is to find the radiated field from a short dipole in free space. The dipole has
a current source equal to $\sin(2\pi t)$. Figure 1 shows the calculated field components at the time
when the initial pulse has propagated a distance of 2.248 wavelengths away from the dipole. A
(49 x 48 x 96) grid was used to generate the results. The fields shown are located on the constant
$y = 20\Delta y$ and $z = 48\Delta z$ plane. The results agree very well with the theoretical solutions.

The second case is the plane wave scattering from a perfectly conducting sphere with ka = 2.3,
where k is the free space wave number and a is the radius of the sphere. A (49 x 48 x 96) grid
was used to generate the results. The field solutions in the time domain were transformed to the
frequency domain and integrated to yield the radar cross section. Two sets of results are shown in
Figure 2, one after the plane wave has gone through one period and the other after two periods.
Showing the results obtained at two different time levels confirms that the field has reached a time
harmonic steady state. Both results agree well with those obtained using Mie series solutions.

5  Conclusions 

The fundamental formulations for the characteristic-based finite volume time domain method have
been presented. The boundary conditions at the outer truncation boundary and on the surface
of the scatterer are also discussed. The numerical results for a radiating dipole and perfectly
conducting sphere agree well with the theoretical solutions. Although the initial results show
great potential for this method, continuing study is required to fully mature this technology.

Figure 1. Fields radiated from a dipole

Figure 2. Radar cross section of a conducting sphere with ka = 2.3

Abstract

This  paper  presents  several  computational  techniques  for  handling  the  eddy  current  problems  in  electrical 
devices.  Both  sinusoidal  as  well  as  transient  variations  of  excitation  systems  in  linear  and  nonlinear  cases  are 
presented.  The  applications  utilize  the  finite  element  method  in  three  dimensions.  Examples  are  shown  using 
vector  and/or  scalar  formulations.  Finally  a  neural  network  model  is  suggested  to  implement  the  solution  of  the 
two  dimensional  eddy  current  problem  in  parallel  with  an  implementation  example. 

Introduction 

In dealing with eddy current problems involving sinusoidal variation, the idea of complex magnetic vector
potential is used to formulate and solve the field problem. In treating the nonlinear transient eddy current problem,
a method which utilizes the magnetic vector potential (A) and electric scalar potential ($\phi$), known as the
(A-$\phi$) method, is used to formulate and solve the field problem. The time derivative in this class of problems is
treated using the Crank-Nicolson and State-Space techniques as well as other time marching schemes. Another
method involving the electric vector potential (T) and the magnetic scalar potential ($\Omega$), known as the
(T-$\Omega$) method, is also used to formulate and solve the 3D transient eddy current problem.

A technique for solving the electromagnetic field equations and calculating the eddy current density
values is based on the Iterative Scalar Potential (ISP) formulation. The procedure is simple and has a major
advantage in that the numerical problem is solved in terms of electromagnetic field variables which have a physical
meaning. Furthermore, in this method, there is only one degree of freedom per node, which results in sizable
savings in computation time. The numerical results of this new technique are obtained in less than 50% of the time
consumed utilizing other methods.

A parallel architecture for eddy current calculations is built from a neural network whose
nodes are fully interconnected. Each node is modeled with a set of circuit elements as determined by the
governing equation. The interconnection between the nodes is produced through weights similar to dependent
sources in an electrical circuit according to a specific rule. The directly connected nodes can interact directly
with each other, while indirectly connected nodes affect each other through the propagation effects of the
continuous-time dynamics of the whole network. A preliminary implementation of this model in a parallel
environment is presented.

The Transient Eddy Current Problem (A-$\phi$ Formulation)

In the numerical model, the magnetic vector potential and electric scalar potential (A-$\phi$) method is
used to formulate and solve the field problem. This method is defined by the following equations:

$$\nabla \times (\nu \nabla \times A) = -\sigma\left(\frac{\partial A}{\partial t} + \nabla\phi\right) + J_0 \tag{1}$$

$$\nabla \cdot \sigma\left(\frac{\partial A}{\partial t} + \nabla\phi\right) = 0 \tag{2}$$

where A and $\phi$ are the magnetic vector potential and the electric scalar potential, respectively. $J_0$ is the
magnetizing current density, $\nu$ is the reluctivity tensor, and $\sigma$ is the conductivity. The application of the
Galerkin method to equations (1) and (2) gives the following integral equations:

VW,  x(^.  V  X  /l)  +a  [I7+  '''I’  'jd'’  ^  I,,  ■  J«  dv  -  -  H  ds 


a  J  f^+  V(!)  ]■  VN,  ■  c/w  =  (,  y,  •  VJV,  dv  -  N,  ■  n  ■  J,  ■  ds 


where $N_i$ are the interpolation functions, S1 is the boundary surface of the volume, S2 is the conductor's
surface, n is the unit vector normal to S1 and S2, and H is the magnetic field intensity.

The  partial  derivatives  in  equations  (3)  and  (4)  can  be  replaced  by  values  of  magnetic  vector  potential 
components  at  the  nodes  of  a  finite  element  grid,  multiplied  by  an  appropriate  set  of  algebraic  coefficients.  For 
a  node  i,  in  this  grid,  the  corresponding  general  equations  governing  the  field  and  eddy  currents  are  as  follows 
after  implementing  the  boundary  conditions; 


-  +  Vd),  - 


^  a  A, 
~Yt 

;  =  lL  I  O  f 


+  V(!),  -ri,yo,  =0 


where $\eta_{ij}$ are the finite element coefficients, $A_u$ are the magnetic vector potential components, $J_{0i}$ is the external
(excitation) current density, NN is the total number of nodes, and u designates the x, y, and z components.

The  Crank-Nicolson  Technique 

The time in equations (5) and (6) can be divided into increments. Each increment has a duration $\tau$,
which is the time between the n and n+1 time instants. Hence, equations (5) and (6) can be approximated as
follows:


2  ;=| 


2  ^
=  0


(8) 


After  algebraic  manipulations  and  rearrangement  of  the  terms  in  equations  (7)  and  (8),  one  has  the 
following  relationships: 


Z  j-\  T  z  J^l  L 


and 

a, 

T 


.;  =  l  ;  =  l  J  L  y=I  J  ^  /=! 


(9) 


(10)


The  State-Space  Technique 

Another method, used for computing the time derivatives in equations (5) and (6), is the state-space
approach. These equations can be written for a volume containing excitation coils, metallic structures and
nonconducting media. It should be pointed out that the excitation winding cannot have eddy currents. If one
designates the total number of nodes in nonconducting media to be l, the total number of nodes within the
excitation winding to be m, and the total number of nodes in the metallic structure to be k, then the following
can be written in a matrix form after proper node numbering and matrix row and column permutation
operations:


X 

X 

X 

(a„,.<|)) 

0 

0 

X 

X 

J 

X 

(A„,.<t>) 

= 

0 

/  \ 

- 

'^Om 

_ri^,(A'X/)  ri„(/:xm)  r|^,(^'x/:)J 

J 

_  0  . 

By  means  of  matrix  manipulations  one  has: 


A,„=[L,]"'L3-(A..,V<t>)  +  [L,]-'.J„.,  (I2) 

A„,=[L,]''  L,.(A,„,V$)  +  [L,r'.J„.,  (I3) 

where the matrices $L_1$ through $L_4$ are definable in terms of the finite element coefficients. It also follows that
$A_{uk}$ is governed by the following system of ordinary differential equations:

$$\dot{A}_{uk} = W \cdot (A_{uk}, \nabla\phi) + Z \cdot J_{0m} \tag{14}$$

where W and Z are matrices defined in terms of the finite element coefficients.

Equations (12) and (13) are basically a set of algebraic relationships while equation (14) is a set of first
order differential equations. This last equation is the main state-space equation. The solution of equation (14),
followed by the application of equations (12) and (13), results in the magnetic vector potential (MVP) over the
whole volume under consideration, and hence the other field variables can be obtained. The solution, however,
hinges upon the calculation of the associated transition matrices. These matrices allow the calculation of the
instantaneous values of the state variables from their previous values. From equation (14), one can write a
standard recursive relation giving the MVP vector, $A_{uk}[(n+1)\tau]$, at the (n+1) instant of time in terms of the
vector of MVP at the previous time instant, $A_{uk}[n\tau]$, as:

$$A_{uk}[(n+1)\tau] = R_1 \cdot A_{uk}[n\tau] + R_2 \cdot J_{0m} \tag{15}$$


where $\tau$ is the time increment or the sampling time by which the time duration is divided into steps of length $\tau$,
and $R_1$ is the first state transition matrix given as:

$$R_1 = U + \tau W + \frac{\tau^2 W^2}{2!} + \frac{\tau^3 W^3}{3!} + \cdots \tag{16}$$

and $R_2$ is the second state transition matrix which contributes the influence of the excitation forcing function
into the solution. This matrix is calculated directly from a series expansion as:

$$R_2 = \left[\tau U + \frac{\tau^2 W}{2!} + \frac{\tau^3 W^2}{3!} + \frac{\tau^4 W^3}{4!} + \cdots\right] Z = W^{-1}(R_1 - U) Z \tag{17}$$


It is assumed that material properties remain time invariant between two integration steps. In equations
(16) and (17), U is the identity matrix. It should be mentioned that in a non-linear transient solution, the state
transition matrices $R_1$ and $R_2$ are updated at every time step.
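
The series in equations (16) and (17) and the recursion of equation (15) can be sketched as follows. This is an illustration for the linear case only, under simplifying assumptions: the state is a plain vector (the scalar potential part of the state in equation (14) is not modeled), W and Z are small dense arrays held fixed over all steps, and the function names and the truncation length `n_terms` are hypothetical.

```python
import numpy as np

def transition_matrices(W, Z, tau, n_terms=20):
    """Truncated series for R1 (eq. 16) and R2 (eq. 17); U is the identity matrix."""
    n = W.shape[0]
    R1 = np.zeros((n, n))
    series = np.zeros((n, n))
    term = np.eye(n)  # holds (tau W)^k / k!
    for k in range(n_terms):
        R1 += term
        series += term * tau / (k + 1)  # tau^(k+1) W^k / (k+1)!
        term = term @ (tau * W) / (k + 1)
    return R1, series @ Z

def march(W, Z, J0, A0, tau, n_steps):
    """Apply A[(n+1) tau] = R1 A[n tau] + R2 J0 for a constant excitation J0."""
    R1, R2 = transition_matrices(W, Z, tau)
    A = np.array(A0, dtype=float)
    for _ in range(n_steps):
        A = R1 @ A + R2 @ J0
    return A
```

In a nonlinear run W and Z change with the material properties, so `transition_matrices` would be called again at every step rather than once, as the text notes.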


Time  Marching  Solution  Procedure 

In  order  to  accomplish  a  solution  of  the  instantaneous  magnetic  field  governed  by  equations  (12) 
through  (14),  in  a  volume,  one  would  proceed  according  to  the  following  steps. 

STEP 1 : n = 0; set all initial conditions.

STEP 2 : Calculate the excitation vector $J_{0m}(n\tau)$.

STEP 3 : Form the matrix equation (11) and equations (12), (13), and (14).

STEP 4 : Calculate $R_1(n\tau)$ and $R_2(n\tau)$ using equations (16) and (17).

STEP 5 : Calculate at the $(n+1)\tau$ instant of time the vector $A_{uk}[(n+1)\tau]$ using equation (15).

STEP 6 : Calculate the remaining vectors $A_{ul}[(n+1)\tau]$ and $A_{um}[(n+1)\tau]$ using equations (12) and (13).

STEP 7 : Calculate the flux densities throughout the volume, as well as other field variables.

STEP 8 : Update the reluctivities for all elements in the magnetic material region.

STEP 9 : Set the increment on the time; set n = n + 1.

STEP 10 : If the transient duration has been reached, print results. If not, go to STEP 2.

Application and Results


The A-$\phi$ method is applied to a practical problem that comprises an exciting coil set between two steel
channels, and a steel plate inserted between the channels as shown in Figure (1). The steel plates are made of
nonlinear material, and the amplitude of the excitation current rises exponentially with time. The problem was
solved with both the Crank-Nicolson and State-Space methods, giving two sets of solutions. The results are
compared with each other and also with measurements. This comparison is carried out for the magnetic flux
density and eddy current density values at the specified test points in the model. The test points for the
comparison of the magnetic flux density values are designated as S1 (0.0 < x < 1.6 mm, 0.0 < y < 25.0 mm,
z = 0.0 mm), S2 (x = 41.8 mm, 0.0 < y < 25.0 mm, 60.0 < z < 63.2 mm), and S3 (122.1 < x < 125.3, 0.0 < y < 25.0,
z = 0.0 mm). Similarly, the test points for the comparison of the eddy current densities are identified as P1
(1.6, 6.25, 0.0 mm), P2 (41.8, 6.25, 63.2 mm), and P3 (125.3, 6.25, 0.0). In Figures (2) and (3), the measured
magnetic flux density values are compared with Crank-Nicolson and State-Space results, respectively. The
same comparison is displayed in Figures (4) and (5) for the eddy current density values.

Figure 2. Comparison of measured flux density values and Crank-Nicolson results.

Figure 3. Comparison of measured flux density values and State-Space results.


T-$\Omega$ Method for Transient Eddy Current Problems

In this method, the electric vector potential (T) and magnetic scalar potential ($\Omega$) are used as unknowns
to solve the 3D nonlinear transient eddy current problem. Results of implementation using first and second
order hexahedral finite elements are given. The same example utilized above with the A-$\phi$ method is used. In
the steel plates, Figure (1), T and $\Omega$ are used. Outside the plates, only $\Omega$ is used. T and $\Omega$ are defined as
follows:

$$H = T - \nabla\Omega + H_s \quad \text{in } V_e \tag{18}$$

$$J = \nabla \times T \quad \text{in } V_e \tag{19}$$

$$H = H_s - \nabla\Omega \quad \text{in } V_a \tag{20}$$

Figure 4. Comparison of measured eddy current density values and Crank-Nicolson results.

Figure 5. Comparison of measured eddy current density values and State-Space results.

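where $V_e$ and $V_a$ denote eddy current regions and eddy current free regions, respectively. The magnetic field
intensity $H_s$ due to the coil in free space is computed before solving the field equations by employing the
Biot-Savart law. The field equations are derived from the potentials as follows:

$$\nabla \times \rho \nabla \times T - \nabla \rho \nabla \cdot T + \frac{\partial \mu (T - \nabla\Omega)}{\partial t} = -\frac{\partial \mu H_s}{\partial t} \quad \text{in } V_e \tag{21}$$

$$\nabla \cdot \frac{\partial \mu (T - \nabla\Omega)}{\partial t} = -\nabla \cdot \frac{\partial \mu H_s}{\partial t} \quad \text{in } V_e \tag{22}$$

$$\nabla \cdot \frac{\partial \mu (-\nabla\Omega)}{\partial t} = -\nabla \cdot \frac{\partial \mu H_s}{\partial t} \quad \text{in } V_a \tag{23}$$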

After  using  vector  identities  and  Gauss’  theorem,  the  weak  form  of  the  field  equations  are  expressed  as: 


r  (  afl(T-VTi)l 

l^^|vxN^.pVxT  +  N,-t^-g^^ - 1. 


r  , 


a, 


dt 


d  t 


(24) 

(25) 

(26)

In a simplified way, equations (24) through (26) could be rewritten as:

M(h)^+Ku=-M{h)^  inV,  +  V.  (27) 

of  of 

where  u  is  the  solution  vector. 

For the time differential, the step by step method is employed. Discretizing the solution u with time as
the independent variable, and using a weighted residual approach, we can express equation (27)
approximately as:


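$$\left[\frac{M(\mu)}{\Delta t} + \theta K\right] u_{n+1} = \left[\frac{M(\mu)}{\Delta t} - (1-\theta) K\right] u_{n} - M(\mu)\,\frac{H_{s,n+1} - H_{s,n}}{\Delta t} \qquad (28)$$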


where θ is a parameter related to the weighting function. θ equals 1, 1/2 or 2/3, corresponding to the Backward
difference, Crank-Nicolson or Galerkin difference schemes, respectively. The subscripts n+1 and n stand for
two adjacent time steps, and Δt = t_{n+1} - t_n.
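
The role of θ can be illustrated with a minimal sketch, which is not the authors' code. It applies the weighted time-stepping scheme to a generic system M du/dt + K u = f(t) with constant matrices; the function names and the scalar test problem du/dt + u = 0 are assumptions chosen only so that the three schemes can be compared against the exact solution exp(-t).

```python
import numpy as np

def theta_step(M, K, u_n, f_n, f_np1, dt, theta):
    """One step of the theta scheme for M du/dt + K u = f (M, K held constant)."""
    lhs = M / dt + theta * K
    rhs = (M / dt - (1.0 - theta) * K) @ u_n + theta * f_np1 + (1.0 - theta) * f_n
    return np.linalg.solve(lhs, rhs)

def integrate(M, K, u0, f, t_end, n_steps, theta):
    """March from t = 0 to t_end in n_steps equal steps."""
    dt = t_end / n_steps
    u = np.array(u0, dtype=float)
    for n in range(n_steps):
        u = theta_step(M, K, u, f(n * dt), f((n + 1) * dt), dt, theta)
    return u

# Hypothetical scalar test problem: du/dt + u = 0, u(0) = 1, exact solution exp(-t).
M = np.array([[1.0]])
K = np.array([[1.0]])
no_source = lambda t: np.zeros(1)
schemes = {"backward": 1.0, "crank_nicolson": 0.5, "galerkin": 2.0 / 3.0}
errors = {
    name: abs(integrate(M, K, [1.0], no_source, 1.0, 50, theta)[0] - np.exp(-1.0))
    for name, theta in schemes.items()
}
```

On this test problem the Crank-Nicolson choice is second-order accurate in Δt, while the other two are first-order, which is reflected in the ordering of the entries of `errors`.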

If Newton-Raphson iteration is chosen to treat the nonlinearity of equation (28), the Jacobian matrix
will not be symmetric. When the ICCG method is then used to solve the equation, the computation time for
each ICCG iteration is doubled compared with the case of a symmetric matrix. Therefore, in each time
step, a relaxation method is adopted here. The values of the element permeability μ are modified iteratively as
follows:


$$\mu^{m+1} = \mu^{m} + \omega\,\Delta\mu^{m+1} \qquad (29)$$


where ω is a relaxation factor and m+1 denotes the current iteration. In our experience, a variable relaxation
factor was shown to be less time consuming. For lower saturation levels, a relatively high value of ω, say
ω = 1.0, may be used to start the nonlinear iteration. The value may be reduced after 5 or 6 iterations to, say,
ω = 0.5. For higher saturation levels, choose a lower value of ω to start, say ω = 0.5, then reduce the value
gradually to, say, ω = 0.1. In general, the number of nonlinear iterations can be reduced in this manner.
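
A variable relaxation factor of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the material curve `mu_from_curve` is hypothetical, the finite element field solve is replaced by the scalar stand-in B = μH for a prescribed H, and the permeability correction is assumed to be the difference between the curve value and the current value.

```python
import numpy as np

MU0 = 4e-7 * np.pi

def mu_from_curve(b):
    """Hypothetical saturable material: permeability falls as |B| rises."""
    return MU0 * (1.0 + 2000.0 / (1.0 + (b / 1.5) ** 2))

def solve_permeability(h, omega_start=1.0, omega_later=0.5, switch_after=5,
                       tol=1e-8, max_iter=200):
    """Relaxed fixed-point iteration for mu at a prescribed field intensity h."""
    mu = mu_from_curve(0.0)                  # start from the unsaturated value
    for m in range(max_iter):
        omega = omega_start if m < switch_after else omega_later
        b = mu * h                           # stand-in for the field solve
        delta = mu_from_curve(b) - mu        # assumed permeability correction
        mu = mu + omega * delta              # relaxed update
        if abs(delta) <= tol * mu:
            return mu, m + 1
    raise RuntimeError("relaxation did not converge")

# Lower saturation: start with omega = 1.0, then 0.5.
mu_low, iters_low = solve_permeability(200.0)
# Higher saturation: start with omega = 0.5, then 0.1.
mu_high, iters_high = solve_permeability(5000.0, omega_start=0.5, omega_later=0.1)
```

In this toy model the strongly saturated case is not a contraction with ω = 1.0, so the smaller starting factor is what makes the iteration settle.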


Numerical  Results 

The computations were carried out using first and second order hexahedral elements. The time
functions of the average flux density and the local eddy current density obtained from the computations are
given in Figures (6), (7) and (8), compared with the measured values and the above results using the A-φ
scheme. Section 1 (S1) denotes the section of the plate at 0 < x < 1.6, 0 < y < 25, z = 0 (mm) as defined above.
Positions 1 and 2 (P1 and P2, defined above) stand for the points (1.6, 6.25, 0.0) and (41.8, 6.25, 63.2) (mm) in the
plates, respectively. Figures (9) and (10) show the distributions of the flux densities and the eddy current
densities on part of the steel plate surface, respectively.

It can be seen that the calculated results agree broadly with the measured results. However, the
computation accuracy with the T-Ω scheme is not as satisfactory as with the A-φ scheme. The reason may lie in the
treatment of the interface condition. To ensure zero normal components of the eddy current densities on the
interface between the conductor and nonconductor regions, the tangential components of T, T_t, should be set to
zero explicitly in the numerical procedure.

In the implemented example, the flux densities and eddy current densities change rapidly along the
direction perpendicular to the main surface of the steel plate. The discretization errors with T_t = 0 are more
sensitive to the mesh density. The thin plate structure with the skin effect limits the refinement of the mesh
along the thickness of the plates. When the saturation level of the steel plates is low, the use of second order
elements improves the computation accuracy effectively, but the improvement is not so distinct when the
saturation level increases. The reason is that, for the same number of nodes as a first order element mesh, the
second order element mesh contains fewer elements. Therefore, the description of the material
characteristics is less accurate with second order elements. The number of ICCG iterations with second
order elements is larger than with first order elements. This is because not only does the number of non-zero
entries of the coefficient matrix increase, but the distribution of the entries is also more scattered. Therefore, the
characteristics of the matrix deteriorate. Taking the solution of the last iteration as the first guess for the current
iteration could greatly reduce the number of iterations in the ICCG procedure.

Figure 6. Average flux densities on section S1.

Figure 7. Eddy current at position P1.

Figure 8. Eddy current densities at position P2.

Legend: second order element; first order element; results in [4]; measured results.

Figure 9. Flux densities on the plate surface.

Figure 10. Flux densities on the plate surface.
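
The effect of reusing the previous solution as the first guess can be shown with a small sketch. It is an assumption-laden illustration: it uses plain conjugate gradients without the incomplete Cholesky preconditioner, and the tridiagonal test matrix and right-hand sides are hypothetical stand-ins for two successive systems that differ only slightly.

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-8, max_iter=1000):
    """Unpreconditioned CG; returns the solution and the iteration count."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.sqrt(rs) <= tol * b_norm:
            return x, k
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, max_iter

n = 200
# Hypothetical symmetric positive definite system matrix.
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b_old = np.linspace(1.0, 2.0, n)   # right-hand side of the previous iteration
b_new = 1.01 * b_old               # slightly changed right-hand side

x_old, _ = conjugate_gradient(A, b_old, np.zeros(n))
_, cold_iters = conjugate_gradient(A, b_new, np.zeros(n))   # zero first guess
x_warm, warm_iters = conjugate_gradient(A, b_new, x_old)    # previous solution as first guess
```

Because the initial residual of the warm-started solve is already small relative to the right-hand side, fewer iterations are needed to reach the same tolerance.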

ISP  Method  and  Formulation  for  Eddy  Current  Problems 

Another new technique for solving the electromagnetic field equations and calculating the eddy current
density values is based on the Iterative Scalar Potential (ISP) formulation. The procedure is simple and is
outlined as follows:


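$$\nabla\times\mathbf{H} = \mathbf{J}_e, \qquad \nabla\cdot\mathbf{B} = 0 \qquad (30)$$

$$\nabla\times\mathbf{E} = \mathbf{J}_m, \qquad \nabla\cdot\mathbf{D} = 0 \qquad (31)$$

where $\mathbf{J}_e = \mathbf{J}_c + \sigma\mathbf{E}$ and $\mathbf{J}_m = -\partial\mathbf{B}/\partial t$.

Here, J_e and J_m are unknowns, and J_c is the conduction current density. The magnetostatic field is
described as:

$$\nabla\times\mathbf{H} = \mathbf{J}_e, \qquad \nabla\cdot\mathbf{B} = 0 \qquad (32)$$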

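A scalar potential can be used to solve equation (32) and the equivalent integral form is:

$$\int_V \nabla N_i\cdot\mu\,\nabla\Phi\;dv = \int_V \nabla N_i\cdot\mathbf{B}_e\;dv - \oint_S N_i\,\mathbf{B}\cdot\mathbf{n}\;ds \qquad (33)$$

B_e in equation (33) can be determined from J_e, with:

$$\mathbf{B} = -\mu\,\nabla\Phi + \mathbf{B}_e \qquad (34)$$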

If J_e is assumed, then B can be solved using a scalar potential in a manner similar to equation (33).
Hence, J_m can be determined and E is obtained using a scalar potential, where the method of solving equation
(31) is similar to the method of solving equation (32). The above equations can be written in an iterative form
as follows:


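$$\nabla\times\mathbf{H}_r^{(k)} = \mathbf{J}_c + \sigma\,\mathbf{E}_r^{(k-1)} \qquad (35)$$

$$\nabla\times\mathbf{E}_i^{(k)} = -\omega\mu\,\mathbf{H}_r^{(k)} \qquad (36)$$

$$\nabla\times\mathbf{H}_i^{(k)} = \sigma\,\mathbf{E}_i^{(k)} \qquad (37)$$

$$\nabla\times\mathbf{E}_r^{(k)} = \omega\mu\,\mathbf{H}_i^{(k)} \qquad (38)$$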

where the subscripts r and i indicate the real and imaginary parts of complex field quantities and the superscript k
designates the iteration step. For fast convergence of the iterative process, accelerating coefficients are
used. These coefficients are determined from the calculation expressions used to obtain the (k+1)th step. The
values required at the (k+1)th step are calculated from equations (39) to (42), respectively.

[Equations (39) to (42) are illegible in the source. The surviving fragments indicate that each expression scales a
sum of two field terms by one of the coefficients C1 to C4.]

where C1, C2, C3, and C4 are accelerating coefficients recalculated throughout the iterative process. This
procedure is repeated for k = 1, 2, ..., N, and the final solutions for B and E are those obtained at the last step.

Application  and  Results 

The ISP method is applied to the aluminum bar example shown in Figure (11). The bar was surrounded by a coil
containing 987 turns and was excited by various values of excitation current at 60 Hz and 120 Hz. Other
geometrical details regarding this example are given in the following section.

Table 1 shows a comparison between the measured and calculated values of eddy current losses in the bar at
60 Hz. Table 1 also includes a comparison with the numerical results obtained using the A-φ method explained
above. As can be seen, the results in Table 1 follow the expected pattern of the current-squared law and correlate
well with solutions from the A-φ method as well as with experimental data. In our experience, the advantages of
this method are: 1) the numerical problem is solved directly in terms of the field variables, which have a physical
meaning, and 2) a sizable reduction in CPU time to obtain the solution in comparison to other methods. The
numerical results of this new technique are obtained in less than 1/2 of the time consumed by the A-φ method.

The solutions compared here are linear and the coil excitation is sinusoidal. This method is not
attractive for nonlinear transient eddy current problems. Nonlinear solutions utilizing the ISP method were
performed on a transformer problem and were reported in reference [6].

Table 1. Comparison between experimental and numerical values of eddy current
losses (W) for the aluminum bar example

                               1.0 A      1.5 A      2.0 A
Laboratory Results (60 Hz)     0.5800     1.5600     3.5200
Numerical A-φ Method           0.7480     1.6730     2.9740
ISP Method (60 Hz)             0.6604     1.5379     2.9927
ISP Method (120 Hz)            1.1842     2.6525     5.4267
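
The current-squared law referred to above predicts that the loss scales with the square of the excitation current. A short sketch, using only the values of Table 1 (the dictionary keys are shortened labels introduced here), compares each row's loss ratios against that prediction:

```python
currents = [1.0, 1.5, 2.0]  # excitation currents in A, as in Table 1

losses = {  # eddy current losses in W, copied from Table 1
    "Laboratory (60 Hz)": [0.5800, 1.5600, 3.5200],
    "A-phi method": [0.7480, 1.6730, 2.9740],
    "ISP (60 Hz)": [0.6604, 1.5379, 2.9927],
    "ISP (120 Hz)": [1.1842, 2.6525, 5.4267],
}

# Ratios predicted by the current-squared law, relative to the 1.0 A column.
expected = [(i / currents[0]) ** 2 for i in currents]

# Ratios actually present in each row of the table.
ratios = {name: [w / row[0] for w in row] for name, row in losses.items()}

for name, row in ratios.items():
    print(name, [round(r, 3) for r in row], "expected", expected)
```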

Neural  Computation  of  the  2-D  Eddy  Current  Problem 

The above methods of solving electromagnetic field and eddy current problems involve computation
procedures which utilize the serial nature of computer systems. Recently, some researchers have been exploring
neural network and parallel procedures to solve electromagnetic problems more efficiently and with a high degree
of robustness. These recent developments have followed the important work of Hopfield in optimization, in
addition to several other works on optimizing electromagnetic devices. The developments here contribute to these
continuing efforts by suggesting an architecture that provides a parallel computation alternative for a variety of
problems in electromagnetics. This procedure is also applicable to any problem which can be modeled by partial
differential or integro-differential equations or any sparse system of equations.
Description of the Method

The ability to preserve the parallel processing nature, the continuous-time dynamics, and the global
interaction of network elements is of great interest in solving eddy current problems. A parallel architecture for
eddy current problems is formed with nodes which are fully interconnected, as shown in Figure (12). Each node is
modeled with a set of circuit elements as determined by the governing equation. The interconnection between the
nodes is produced through weights, similar to dependent sources in an electrical circuit, according to a specific
rule. Directly connected nodes interact directly with each other, while indirectly connected nodes affect each
other through the propagation effects of the continuous-time dynamics of the whole network. In general, the
interconnection can be of any dimension. A node of any row or column (i,j) processes information through a
function derived from the governing field equation.

Figure 12. A 2-D neural system for eddy current calculations.

To determine the eddy currents in a two dimensional system, the governing equation is as follows:


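$$\nabla\times(\nu\,\nabla\times A) - \sigma\frac{\partial A}{\partial t} + \frac{\sigma}{a}\int_S \frac{\partial A}{\partial t}\,ds = [\text{right-hand side illegible in the source}] \qquad (43)$$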

where ν is the material reluctivity, A is the magnetic vector potential, σ is the conductivity, a is the cross sectional
area of the exciting conductor, S is the surface, and J is the measured current. If equation (43) is discretized using
two dimensional finite differences, one obtains the following for each node:

$$w_1 A_{i,j+1} + w_2 A_{i+1,j} + w_3 A_{i,j-1} + w_4 A_{i-1,j} - (w_1 + w_2 + w_3 + w_4) A_{i,j} - \sigma\frac{\partial A_{i,j}}{\partial t} + \frac{\sigma}{a}\sum_k \frac{\partial A_k}{\partial t} = [\text{right-hand side illegible in the source}] \qquad (44)$$

where the w_i's are obtained from the geometries and the summation over k runs over the nodes of the coil area.
Equation (44) discretizes the function in space with inter-connected nodes, and uses the continuous-time
dynamics to simulate the behavior of the magnetic vector potential. Each node in Figure (12) is modeled by the
cell circuit of Figure (13). The values of the circuit elements are given as follows:

[The expressions for the circuit element values are illegible in the source.]


This parallel interaction of the nodes makes the process much faster than the sequential process. The above
procedure is implemented on two examples. In these applications, we consider purely sinusoidal excitation and
hence a complex representation of all field variables is possible. The ∂/∂t term can be replaced by jω. If the eddy
current in the exciting coil is considered, all the terms of equation (44) will be considered. The results are
compared with solutions from standard finite element analysis.
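
The idea of letting every node evolve simultaneously from its directly connected neighbours can be sketched in software. This is only a rough analogue under explicit assumptions, not the circuit model of the paper: all weights are set to 1, the coil coupling sum is omitted, the potential is fixed at zero on the outer boundary, and the node dynamics are taken as σ dA/dt = Σ w (A_neighbour - A) + J and integrated with explicit Euler steps. The names `neighbour_sum`, `node_residual` and `run_network` are hypothetical.

```python
import numpy as np

def neighbour_sum(a):
    """Sum of the four directly connected neighbours (zero outside the grid)."""
    padded = np.pad(a, 1)
    return (padded[:-2, 1:-1] + padded[2:, 1:-1]
            + padded[1:-1, :-2] + padded[1:-1, 2:])

def node_residual(a, source):
    """Right-hand side of the node dynamics; zero at steady state."""
    return neighbour_sum(a) - 4.0 * a + source

def run_network(source, sigma=1.0, dt=0.2, steps=2000):
    """Update all nodes at once, as a stand-in for the parallel network."""
    a = np.zeros_like(source)
    for _ in range(steps):
        a = a + (dt / sigma) * node_residual(a, source)
    return a

# Hypothetical 8 x 8 grid with a small driven region in the middle.
source = np.zeros((8, 8))
source[3:5, 3:5] = 1.0
potential = run_network(source)
```

Each update uses only neighbouring values, so indirectly connected nodes influence each other through repeated propagation, and the steady state satisfies the discretized static equation.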
Example and Preliminary Results

The example described in Figure (11) is that of an aluminum bar (2.54 cm x 2.54 cm x 20.32 cm) inserted in a
coil (8.128 cm long and 0.32 cm thick) as shown in Figure (3). The coil has 987 turns. We compute this problem
with rms currents of 1.0 A, 1.5 A and 2.0 A at frequencies of 60 Hz and 120 Hz. Other frequencies of 200 Hz and
1000 Hz will also be tested. The eddy current losses in the coil are neglected.

Figures (14) and (15) show results of the aluminum bar example obtained by this neural parallel approach and
by standard FE analysis for 2.0 A at 60 Hz. Table 2 shows the eddy current losses at 1.0 A, 1.5 A and 2.0 A. The
2-D eddy current loss data at 60 Hz shown in Table 2 can also be compared with the 3-D finite element results
obtained using the ISP and the A-φ methods above. These results agree well with the results obtained in the
previous section for the same example.

Figure 13. A cell circuit for each node.


Table 2. Comparison of solutions at various currents for the aluminum bar

Parameter                 Quantity                  2-D Finite Element Solution    Neural Solution
I = 1.0 A, f = 60 Hz      Eddy current loss (W)     0.6742                         0.6731
I = 1.5 A, f = 60 Hz      Eddy current loss (W)     1.5938                         1.5914
I = 2.0 A, f = 60 Hz      Eddy current loss (W)     2.8845                         2.8862
I = 2.0 A, f = 120 Hz     Eddy current loss (W)     5.3389                         5.3415

Figure 14. MVP lines from FE solution at 60 Hz.

Figure 15. MVP lines from Neural solution at 60 Hz.
Conclusion

This paper presented a variety of formulations and computational techniques for eddy current
problems in electrical devices. Techniques for two dimensional and three dimensional cases were presented
utilizing vector and scalar formulations. Both sinusoidal and transient variations of excitation systems in both
the linear and nonlinear cases were treated. The implemented examples showed how the various techniques are
utilized. Due to the serial nature of current methods of eddy current analysis, a neural network model was
suggested for implementation in a parallel environment, with a simplified example in two dimensions.
Ten Years of Evolution of the FDTD-like Conformal Techniques

Abstract

FDTD has been criticized for its inability to accurately model electromagnetic problems with curved boundaries.
This shortcoming has been addressed by the FDTD community for the last ten years or so. It is the purpose of this
paper to give a connecting account of the FDTD-derived conformal techniques (time leapfrog and spatially
staggered) and to supply some motivation leading to those techniques. The paths to these techniques improving the
original rectangular FDTD have not been straight, however.

Introduction 

The present author published the FDTD numerical algorithm to solve Maxwell's equations in rectangular
coordinates in 1966 [1]. He did no further work on it until 1984, when he returned to the Lawrence Livermore
National Laboratory to work on microwave coupling problems requiring the numerical solution of Maxwell's
equations. Meanwhile (1970-1980), the workers on EMP (Kunz, Holland, Lee, and Merewether of Mission Research
Corporation) made use of FDTD and introduced many necessary auxiliary results to make the FDTD
algorithm a practical tool for analyzing EMP problems [2-5]. In 1975 Taflove introduced the name FD-TD and has
since contributed greatly to the techniques and applications of FDTD [6-8]. In fact, the original application of
FDTD to the RCS calculation was due to him and his coworkers. FDTD as a practical tool seems to have been
complete after the publication of the radiation boundary condition approximation by Mur in 1981 [9]. Since the
mid-1980s, the application of FDTD to solve electromagnetic problems has been explosive. The number of citations
of the 1966 paper [1] by journal articles is now well over 600; and if we include citations by orally presented
papers, the number can easily top 1000! But practically all these applications are based on the stair-casing
rectangular FDTD algorithm.

Meanwhile, our colleagues in the method of moments (MOM) and in finite elements (FE) have been rightly
pointing out that FDTD would have trouble modeling curved surfaces other than in some special cases
where a coordinate representation of the surface is possible; and even then the curvilinear FDTD would have
problems. In order to enlarge the scope of applicability of the rectangular FDTD, Holland published the FDTD
algorithm in general coordinate systems in 1983 [10]. The curvilinear algorithm was implemented by Fusco in 1990
[11a, 11b]. The main shortcoming of the Holland algorithm is that it requires a coordinate system. In 1984 Yee [12]
and Weiland [13] noticed a generalization of the original FDTD through the surface-curve integral form of
Maxwell's equations. In a series of technical notes at the Lawrence Livermore National Laboratory and through
personal conversations with workers in time domain electromagnetics, Yee (among others) urged them to use
Faraday's law and Ampere's law in their original experimental forms (surface-curve integral form), as he had found
them to be fruitful in the microwave coupling problems. Taflove adopted the integral form and derived many
useful results [14-16]. The integral form of Maxwell's equations was used by Holland and colleagues in deriving
thin wire and thin slot modifications of FDTD, but the wide use of the integral form to derive numerical
approximations seems to have started in earnest in 1984.

Presently, workers in time domain FDTD routinely use the integral form to derive numerical discretizations of
Maxwell's equations. With the help of the integral form, Yee was able to show that it was possible to generalize
FDTD to an irregular grid (in 2-D and in special cases in 3-D) where a known coordinate system is not necessary
(Figure 4A). But he encountered difficulties in the 3-D generalization. This new approach was presented at the first
ACES meeting in March 1985 in Livermore. Among those present was T. Jurgen, who later wrote a dissertation
under Taflove on the application of the integral form to adjust FDTD near a curved boundary
[15,16]. A significant breakthrough in generalizing FDTD to a general unstructured irregular grid was introduced
by Madsen and Ziolkowski [17,18] when they, in addition to using the surface-curve integral form, also employed
the volume-surface integral form of Maxwell's equations. In their work the main tool is the surface-curve integral
form and the auxiliary tool is the volume-surface integral form, employed for "correction". The Madsen-Ziolkowski
method works well when the grid is nearly orthogonal. The implementation is, however, very complicated for a
general grid; and Madsen has since abandoned it in favor of the discrete surface integral (DSI), where he uses the
surface-curve integral form and averaging [19]. Meanwhile, Yee and his colleagues attempted to generalize
FDTD with overlapping grids generated by coordinate patches [20]. In theory this method would work, but in
practice it is too complicated to implement for general problems. Nevertheless, the overlapping-grid FDTD was
found effective for many special problems, and it can be very general for 2-D scattering by smooth cylinders, as in
this case a local tangent-normal orthogonal grid can be introduced. In the search for further generalization of the
overlapping-grid FDTD, and with the desire to make use of surface triangular grids such as those employed by MOM
analysts, Yee made use of the volume-surface integral form of Maxwell's equations in addition to the surface-curve
integral form. This is the FDTD/FVTD hybrid technique [21]. The FDTD/FVTD hybrid is now quite well
developed and, in the opinion of the present author, it is rather easy to implement.


The  Evolution  of  the  FDTD-like  Algorithm 

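Starting from Maxwell's equations in partial differential equation form:

$$\mu\frac{\partial\mathbf{H}}{\partial t} = -\nabla\times\mathbf{E}, \qquad \varepsilon\frac{\partial\mathbf{E}}{\partial t} = \nabla\times\mathbf{H}$$

one can apply the standard central difference to derive the following finite difference equations (refer to Figure 7A
and [1] for notations):

$$H_x^{n+1/2}(i,j+\tfrac12,k+\tfrac12) = H_x^{n-1/2}(i,j+\tfrac12,k+\tfrac12) + \frac{\Delta t}{\mu(i,j+\tfrac12,k+\tfrac12)\,\Delta x}\Bigl[E_y^{n}(i,j+\tfrac12,k+1) - E_y^{n}(i,j+\tfrac12,k) + E_z^{n}(i,j,k+\tfrac12) - E_z^{n}(i,j+1,k+\tfrac12)\Bigr]$$

and other equations.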
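
The time leapfrog, spatially staggered update can be illustrated in one dimension. The following is a minimal sketch and not code from [1]: it assumes normalized units (ε = μ = 1, unit cell size), a single E_z and H_y component pair, an initial Gaussian pulse in E, and end nodes of E that are never updated. The function name and parameters are hypothetical.

```python
import numpy as np

def run_fdtd_1d(n_cells=200, n_steps=150, courant=0.5):
    """1-D leapfrog update: E at integer nodes, H at half nodes."""
    x = np.arange(n_cells + 1)
    ez = np.exp(-((x - 100) / 10.0) ** 2)   # initial Gaussian pulse in E
    hy = np.zeros(n_cells)                  # H lives between the E nodes
    for _ in range(n_steps):
        # H advances by half a step offset, using the spatial difference of E.
        hy += courant * (ez[1:] - ez[:-1])
        # E advances using the spatial difference of the new H (end nodes fixed).
        ez[1:-1] += courant * (hy[1:] - hy[:-1])
    return ez, hy

ez, hy = run_fdtd_1d()
```

With these settings the initial pulse splits into two pulses of roughly half the amplitude, which after 150 steps at a Courant number of 0.5 have each travelled about 75 cells.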

In Figures 1a and 1b we show Maxwell's equations in surface-curve integral form, together with the verbal
description of these laws. The verbal statements are copied directly from Schelkunoff's last book [22], where he
urged the reader to consider Maxwell's equations in their integral forms. The original FDTD equations can be
derived directly from these two sets of equations if one chooses the areas to be the faces of the cubes (Fig. 7A). But
the expressions in Figures 1a and 1b are more general. In fact, the area vector A is well defined even if the curve is
not planar. The approximate discretized equations can be derived if one pays a little attention to the mean-value
theorem of vector integrals. Roughly, the curve ∂A cannot be too weird and the area enclosed by it should not be
too odd. The data along ∂A should be sufficient to allow an accurate evaluation of the line integral. Knowing a
little about the behavior of the electromagnetic field will help to ascertain the limitations of this approximation. In
neighborhoods of thin wires and thin gaps these approximations need to be modified to reflect the singular behavior
of the electromagnetic field. Depending on what curve is chosen and on where the field variables are
located, these integral forms provide great flexibility to generate discretized equations. Maxwell's equations are
very symmetric in E and H. The discretized equations to update E and H should be as similar as possible. Such
symmetry has been attained in the original FDTD. In deriving discretized equations, one only needs to evaluate
approximately a line integral and a surface integral. We will call the curve along which the line integral involving
the electric field E is evaluated the electric contour, and similarly for a magnetic contour. The grid or grids chosen
for the application of these integral forms should have the following essential property (Figure 2a, Figures 7A, B, C):
Each edge associated with an electric field component ought to have a magnetic contour enclosing it (right
hand rule) and each edge associated with a magnetic field component ought to have an electric contour
enclosing it (left hand rule).

The  rectangular  FDTD  grid,  the  Holland  generalized  FDTD  grid,  and  the  Madsen  DSI  grid  all  have  the  above 
property.  For  accurate  approximation,  one  should  "center"  the  variables  with  respect  to  the  contours  and  the  areas. 
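
The two ingredients of the surface-curve integral update, the area vector of a closed contour and the line integral around it, can be sketched for a polygonal contour. This is an illustrative sketch under assumptions introduced here: the contour is a closed polygon with straight edges, the field is sampled at edge midpoints, and the helper names are hypothetical.

```python
import numpy as np

def area_vector(vertices):
    """Area vector of a closed polygon; well defined even if it is not planar."""
    v = np.asarray(vertices, dtype=float)
    return 0.5 * np.cross(v, np.roll(v, -1, axis=0)).sum(axis=0)

def circulation(field, vertices):
    """Midpoint-rule line integral of `field` around the polygon."""
    v = np.asarray(vertices, dtype=float)
    nxt = np.roll(v, -1, axis=0)
    mid = 0.5 * (v + nxt)
    return sum(field(m) @ d for m, d in zip(mid, nxt - v))

def faraday_rate(e_field, vertices, mu=1.0):
    """Rate of change of the H component along the area vector (Faraday's law)."""
    area = np.linalg.norm(area_vector(vertices))
    return -circulation(e_field, vertices) / (mu * area)

# Hypothetical test field with uniform curl equal to omega.
omega = np.array([0.0, 0.0, 2.0])
e_field = lambda r: 0.5 * np.cross(omega, r)

square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]        # planar contour
warped = [(0, 0, 0), (1, 0, 0), (1, 1, 0.3), (0, 1, 0)]      # non-planar contour
rate_square = faraday_rate(e_field, square)
```

For a field with uniform curl, the circulation equals the curl dotted with the area vector for both contours, which is the sense in which the area vector remains meaningful for a non-planar curve.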

Equivalent to the surface-curve integral forms are the volume-surface integral forms of the Faraday and
Ampere laws (Figures 1c, 2b, 7D). The discretization involves no more than the evaluation of volume and surface
vector integrals. Again, accuracy depends on the locations of the field variables.

We shall from now on refer to the updating of the field variables by the surface-curve integral form of
Maxwell's equations as FDTD (finite difference time domain, generalized) and to the updating of the field variables
by the volume-surface integral form of Maxwell's equations as FVTD (finite volume time domain).

To  see  how  various  schemes  are  related,  we  would  like  to  introduce  some  terminology  of  the  original  FDTD  grid 
(Figure  7A).  The  FDTD  cubes  will  be  referred  to  as  electric  cubes.  The  totality  of  electric  cubes  form  an  electric 
grid.  The  vertices  (edges,  faces)  of  these  cubes  will  be  referred  to  as  electric  vertices  (edges,  faces)  The  component 
of  the  electric  field  along  an  electric  edge  is  located  at  the  middle  of  the  electric  edge.  If  we  connect  the  centers  of 
two  cubes  having  a  common  face,  w-e  obtain  another  grid  known  as  the  magnetic  grid  (Figure  Ic,  10).  The  cubes  of 
the  magnetic  grid  are  the  magnetic  cubes  and  so  on.  The  component  of  the  magnetic  field  along  a  magnetic  edge  is 
located  at  the  middle  of  a  magnetic  edge.  The  boundary  of  a  magnetic  (electric)  face  forms  a  magnetic  contour 
enclosing  an  electric  (magnetic)  edge.  Wc  ob.serve  that  the  area  vector  as.sociated  with  a  magnetic  contour  is  along 
the  same  direction  as  the  electric  edge  in  an  orthogonal  coordinate  system.  Imagine  now  the  rectangular  electric  grid 
is  distorted  in  a  non-orthogonal  coordinate  grid.  We  would  have  the  same  topology  as  the  rectangular  grid.  This  is 
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the  grid  Holland  used  (Figure  7B).  There  is  a  difference  between  a  general  non-orihogonal  grid  and  the  rectangular 
grid.  The  direction  of  the  area  vector  enclosed  by  a  magnetic  (electric)  contour  may  not  be  in  the  same  direction  as 
that  of  the  electric  (magnetic)  edge  this  contour  is  associated  with.  To  evaluate  the  electric  contour  integral  we  need 
the  component  of  the  electric  field  along  the  edges  (which  we  assume  we  have).  The  component  of  an  electric 
(magnetic)  field  associated  with  an  electric  (magnetic)  edge  is  updated  along  the  area  vector  associated  with  the 
magnetic  (electric)  contour.  This  area  vector  is  not  along  the  direction  of  the  electric  edge  unless  the  coordinate  is 
orthogonal.  Thus,  in  order  to  obtain  the  component  of  the  electric  vector  along  an  electric  edge,  Holland  had  to  first 
calculate  the  electric  vector  at  the  middle  of  an  electric  edge.  FDTD  will  yield  the  component  along  the  area  vector. 
Thus  one  relation  of  the  three  rectangular  components  at  the  middle  of  an  electric  edge  is  known;  and  the  other  two 
relations  must  come  from  interpolations  (Figures  8B  and  13b).  This  interpolation  process  is  only  possible  if  there  is 
a  smooth  coordinate  system  and  of  course  the  computational  elements  are  small  compared  to  wavelength.  When  the 
coordinate  system  is  nearly  orthogonal,  the  direction  of  an  electric  edge  and  the  area  vector  of  the  associated 
magnetic  contour  will  be  nearly  the  same.  This  is  the  basis  of  the  Madsen-Ziolkowski  method.  They  updated  the 
component  of  the  electric  field  along  the  magnetic  area  vector  and,  unlike  Holland,  they  updated  the  other  two 
relations (corrections) by FVTD. The process is very awkward. The latest Madsen DSI algorithm, shown in
Figure 8C and Figure 13c, is much simpler. In this algorithm the FVTD is discarded, and the component of the
electric field along an electric edge is obtained from the updated FDTD data with an averaging. The DSI of Madsen is a
generalization of the FDTD and Holland's FDTD in that one does not need a coordinate system for computation. In
fact (referring to Figure 13c), it is only required that the magnetic area vectors associated with the edges a, b, and c,
respectively, be linearly independent. This algorithm is applicable on an unstructured grid. The FDTD updating
and the averaging for the field variables seem to be more complicated, however, when compared with the
overlapping FDTD/FVTD advanced by Yee. The composite FVTD grid consists of two grids: an electric grid and a
magnetic grid. The electric vector is located at the electric vertices and the magnetic vector at the magnetic
vertices. The magnetic vertices should be near the centers of the electric elements (distorted cubes, prisms,
or any other solid "elements") and the electric vertices should be near the centers of the magnetic elements (Figures
1c, 10). The electric vector and the magnetic vector are updated by the FVTD algorithm (Figure 2b).
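
The linear-independence requirement on the three area vectors can be illustrated with a small sketch. Given the projections of the field onto three directions, the full vector follows from a 3x3 linear solve, which fails exactly when the directions are dependent. The function name, the tolerance, and the use of Cramer's rule are illustrative assumptions, not part of the DSI algorithm as published.

```python
def vector_from_projections(directions, projections):
    """Solve d_k . E = p_k (k = 0, 1, 2) for the 3-vector E by Cramer's rule."""
    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(directions)
    if abs(d) < 1e-12:  # illustrative tolerance
        raise ValueError("area vectors are (nearly) linearly dependent")
    field = []
    for col in range(3):
        # Replace one column of the direction matrix by the projections.
        m = [list(row) for row in directions]
        for row in range(3):
            m[row][col] = projections[row]
        field.append(det3(m) / d)
    return field
```

For nearly orthogonal grids the determinant stays well away from zero; on strongly skewed cells it shrinks and the recovered vector becomes sensitive to errors in the projections.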

It  is  the  present  author's  habit  to  assign  electric  edges  or  electric  vertices  at  the  boundary  of  a  computational  grid. 
The data at the boundary, whether it is a physical boundary or a computational boundary, cannot be updated by
either the FDTD or FVTD algorithm because the necessary magnetic contour or the magnetic surface would not be
contained  in  the  computational  volume.  The  data  at  the  physical  boundary  is  obtained  with  the  help  of  the  boundary 
condition  and  the  data  in  the  outer  computational  volume  is  obtained  with  the  help  of  the  radiation  boundary 
condition  simulation.  These  simulations  will  not  be  discussed  here.  Shown  in  Figure  3  are  three  possible  grids.  The 
stair-casing  grid  is  for  the  rectangular  FDTD,  the  locally  distorted  grid  can  be  used  by  DSI  and  by  FVTD,  and  the 
overlapping  conformal  grids  can  be  used  by  the  FDTD/FVTD  hybrid.  In  the  overlapping  grid,  the  outer  boundary  is 
the  computational  boundary  and  it  is  also  the  outer  boundary  of  the  rectangular  grid.  The  inner  boundary  of  the 
rectangular  grid  is  one  to  two  "zones"  away  from  the  scattering  object.  The  body  conformal  grid  consists  of  several 
layers  of  prisms  (Figure  10).  In  the  interior  field  points  of  the  rectangular  grid  we  use  the  FDTD  algorithm  for 
updating,  and  in  the  interior  vertices  of  the  conformal  grid  we  use  the  FVTD  algorithm  for  updating.  The  data  at  the 
outer  boundary  of  the  rectangular  grid  is  fixed  with  the  help  of  radiation  condition  simulation  and  the  data  at  the 
physical boundary (the scatterer) is fixed with the help of the physical boundary condition. The data at the interior
boundary of the rectangular grid can be obtained through interpolation of the calculated data of the conformal grid
(Figure 12), and the data at the outer boundary of the conformal grid can be obtained through interpolation of the
calculated data of the rectangular grid (Figure 12). The overlapping allows a systematic interpolation. The price
paid over a single distorted grid is the double interpolations at each time step. However, one can obtain a conformal
3-D grid for the overlapping grid from a surface grid, whereas a distorted grid requires a full-fledged 3-D grid, which
is more difficult to generate. One can use the FDTD/FVTD hybrid with a single distorted grid (the grid shown in
Figure 5 is the electric grid, and there is a corresponding magnetic grid not shown). Holland uses the grids of Fig. 5A
and Fig. 5B to perform calculations with the Madsen-Ziolkowski scheme [23], whereas we use the grid shown in
Figure 5C with our FDTD/FVTD for the calculation of scattering by a circular cylinder [24]. For 2-D problems the
FDTD/FVTD is very flexible, as we showed with the possible grids used in Figure 6 [24].
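
The transfer of data from the rectangular grid to a vertex of the conformal grid can be sketched as a bilinear interpolation in 2-D. This is a minimal illustration only: the function name, the uniform spacing, and the choice of bilinear weights are assumptions made here, and the interpolation actually used by the authors may differ.

```python
def bilinear_interpolate(field, x0, y0, dx, dy, x, y):
    """Interpolate field[i][j], sampled at (x0 + i*dx, y0 + j*dy), at the point (x, y)."""
    fx = (x - x0) / dx
    fy = (y - y0) / dy
    # Index of the lower-left node of the enclosing cell, clamped to the grid.
    i = min(max(int(fx), 0), len(field) - 2)
    j = min(max(int(fy), 0), len(field[0]) - 2)
    s = fx - i
    t = fy - j
    return ((1 - s) * (1 - t) * field[i][j] + s * (1 - t) * field[i + 1][j]
            + (1 - s) * t * field[i][j + 1] + s * t * field[i + 1][j + 1])
```

The reverse transfer, from the conformal grid to a rectangular field point, additionally requires locating the conformal element that contains the point, which is the bookkeeping step described for Figure 12.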

The Madsen-Ziolkowski modified finite volume technique, the Madsen discrete surface integral technique, and our
FDTD/FVTD overlapping grid technique sometimes encounter late-time growth. There exist two ways to remedy
this instability. Riley and Turner [25] employ a time averaging of the magnetic vector H and have found stability.
In our work with the FDTD/FVTD hybrid, we find stability by employing an averaging of the rectangular grid data
and the conformal grid data for the magnetic field in the overlapping region [26]. Recently, papers showing
improvement of the FDTD and the FVTD have also been published by Vinokur and Yarrow [27] and by Liu [28a,
28b].

There also exist the locally modified rectangular grids of Jurgen-Taflove [14, 16] and Fang-Ren [29], as shown in
Figure 9. In these grids the rectangular grid is retained; the locations of the rectangular field components are the
same as those in the rectangular FDTD. The (generalized) FDTD is used to update some variables near the boundary.
However, near the boundary, some field variables are obtained through extrapolation and/or interpolation. The
present author tried this technique but gave up because of the bookkeeping complexity. The quoted authors
apparently have overcome (at least partially, if not completely) the bookkeeping nightmare and produced some
calculations showing the improvement over the stair-casing grid.

Recently, an FDTD-like technique employing the Whitney elements has been suggested by Yee and exploited by
Chan et al. [30]. The idea is illustrated in Figure 14 for the 2-D TM waves. The explanation in 3-D is simple (in
principle). Imagine the computational space to consist of tetrahedra. We assign the electric field component at the
middle of each edge of a tetrahedron, and the normal component of the magnetic field at the centers of its faces.
The electric contours are the boundaries of the faces of the tetrahedron. Knowing the electric field
components at the middle of the edges, the normal component of the magnetic field at the centers of the faces can be
updated by means of FDTD. The difficulty (one the present author did not know how to overcome
for 3-D from 1984 until he heard of the Whitney elements through J. F. Lee in 1992) is how to make use of
the normal component of the magnetic field on the faces of a tetrahedron. The Whitney "face" elements are the
vector interpolants defining a vector field throughout the whole tetrahedron having the same normal components at
the centers of the faces. For a given electric edge there are several tetrahedra sharing this edge. The magnetic vector is
defined throughout these tetrahedra. For a given electric edge, one can construct a surrounding magnetic contour
which lies inside the collection of tetrahedra and which encloses an area with the area vector in the direction of the
electric edge. The electric field component along this edge can now be updated. A variant of this method can be our
FDTD/FVTD hybrid because the Whitney elements allow us to define a magnetic vector at the center of each
tetrahedron.
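
The role of the Whitney "face" elements as interpolants can be sketched in the 2-D analogue, where the "faces" of a triangle are its edges. The lowest-order face basis function attached to an edge is (r - v) / (2 * area), with v the vertex opposite that edge; it carries unit outward flux through its own edge and none through the other two. The function names below are illustrative, and the helper for a uniform field is included only so the sketch can be checked.

```python
def whitney_face_field(vertices, fluxes, point):
    """Interpolate a 2-D vector field inside a triangle from its three edge fluxes.

    vertices: three (x, y) corners in counterclockwise order.
    fluxes[k]: total outward flux through the edge opposite vertices[k].
    """
    (x0, y0), (x1, y1), (x2, y2) = vertices
    area = 0.5 * ((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
    if area <= 0.0:
        raise ValueError("vertices must be counterclockwise and non-degenerate")
    px, py = point
    fx = fy = 0.0
    for (vx, vy), flux in zip(vertices, fluxes):
        # Basis function of the edge opposite (vx, vy): (r - v) / (2 * area).
        fx += flux * (px - vx) / (2.0 * area)
        fy += flux * (py - vy) / (2.0 * area)
    return fx, fy


def edge_fluxes_of_uniform_field(vertices, field):
    """Outward fluxes of a uniform field through the edge opposite each vertex."""
    fluxes = []
    for k in range(3):
        (ax, ay), (bx, by) = vertices[(k + 1) % 3], vertices[(k + 2) % 3]
        # For counterclockwise ordering, (by - ay, -(bx - ax)) is the outward
        # normal scaled by the edge length.
        fluxes.append(field[0] * (by - ay) - field[1] * (bx - ax))
    return fluxes
```

Because a uniform field lies in the span of these basis functions, interpolating its edge fluxes returns the same field at any interior point, which gives a quick consistency check. In 3-D the same construction applies to the four faces of a tetrahedron.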

Other  Time  Domain  Methods 

Our finite element colleagues have recently been very active in the time domain method of solving Maxwell's
equations. In 1984 Cangellaris et al. [31a, b] published the point matching finite element time domain method.
Because they matched Maxwell's equations at the nodes, they obtained an explicit system of ordinary differential
equations. But most of the recent time domain finite element numerical equations are derived from a Galerkin
process. Also, the Whitney elements seem to be preferred. Mur [32, 33] used the Whitney edge elements to
represent the electric field and the Whitney face elements to represent the magnetic field. He then uses a Galerkin
weighting procedure to generate a linear system of ordinary differential equations to solve for the time dependent
expansion coefficients. Lee [34] uses the Whitney edge elements alone as expansion functions and applies the Galerkin
method to the weak form of the vector wave equation satisfied by the electric vector to derive a system of ordinary
differential equations. Mahadevan and Mittra [35, 36], following the work of Mur, have obtained results with the
Whitney edge and face elements. All these methods require an inversion of a matrix. However, their authors are able to
give criteria for the numerical stability of their methods, whereas for the DSI and our FDTD/FVTD such stability
analyses have not been obtained. So far we still rely on our knowledge of the stair-casing FDTD to guide us on the
stability question. However, the big advantage of DSI and the FDTD/FVTD is that they are explicit, requiring no
inversion of a large matrix.

There is a very successful time domain conformal technique introduced from computational fluid dynamics by
Shankar and his associates at Rockwell [37, 38]. Maxwell's equations are cast in conservative form and a finite
volume discretization is employed. Grids used in hydrodynamic calculations are used for their calculations.

Conclusion 

From the above exposition, it is seen that the conformal time domain numerical solution of Maxwell's equations is
a very active field. Not only have the FDTD-like algorithms advanced, but other similar methods have as well.
Furthermore, the division between finite difference and finite element methods seems to be artificial and is narrowing. If
Cangellaris and Mei had made use of the finite volume time domain technique in 1984, they would have derived
results very similar to our FVTD and their point matched finite element method would have been simplified; and if
Yee had known the Whitney elements (they have existed since Whitney published his book in 1957 [39]) in 1984, he
would have obtained results very similar to those of Mur, Lee, Mittra, and others in an explicit manner.

Fig. 4 Two Early 2-D Irregular Grids for Internal e/m Problems:

A. A Grid for the Interior of a Circular Cylinder (Yee, 1984);

B. The Madsen-Ziolkowski Grid of Grid A.


Fig. 5 Three Irregular Grids for a Circular Cylinder:

A. Locally distorted grid; the outer part of this grid is rectangular (Holland, 1992)

B. Unstructured locally distorted grid (Holland, 1992)

C. Structured non-orthogonal grid (Chen-Prodan-Yee, 1993)

Fig. 9 Rectangular Grids with Special Adjustment at the Curved Boundary: A, B (Taflove, 1987; Jurgen-Taflove,
1993); C, D (Fang-Ren, 1993, with 16 cases)


• An electric (primary) element, on the vertices of which the electric vectors are located.

• A magnetic (dual) element, on the vertices of which the magnetic vectors are located.


Fig. 10 The Triangular Prismatic Conformal Body Grid (Yee-Chen, 1992)


Fig. 11 The Overlapping Conformal FDTD/FVTD Grid (Yee-Chen, 1992): A rectangular grid overlapping a conformal
body grid. The body grid consists of four to five layers of prisms erected along the surface normal, and is
surrounded by an outer rectangular grid. Labels in the figure: radiation boundary; near field to far field
extrapolating surface; inner boundary of the rectangular grid; outer boundary of the curvilinear body grid.

■ From the rectangular grid to the body grid: make use of the rectangular coordinates of an electric vertex in the
body grid in order to interpolate from the rectangular components of the electric field in the rectangular grid.

■ From the body grid to the rectangular grid: for a rectangular field point, determine which electric element it
belongs to and where in that element it is located.


Fig. 12 The Outer Boundary of the Conformal Grid and Inner Boundary of the Rectangular
Grid. The data at these boundaries are obtained through interpolations.


Electric field component

○ Magnetic field component


Fig. 13a FDTD Rectangular Field Components: The variables are updated
with the FDTD algorithm.


Fig. 13b Non-Orthogonal FDTD Variable Locations: The variables are
updated with FDTD and averaging. Figure annotations: A, B, C are the unit area vectors of the areas
enclosed by the magnetic contours linking the various edges, respectively; the components available from those
contours are calculated with FDTD; C · E(a) is approximated by an average over the neighboring edges c1, ..., c4;
and A · E(a), B · E(a), C · E(a) are then used to determine E(a).

Fig. 14 The Leapfrog Updating Scheme with Whitney Elements for 2-D TM
Waves: The Whitney "face" elements are used as interpolants for the
magnetic vector from the normal "face" components.


Whitney  Elements  Time  Domain  (WETD)  Methods  for 
Solving  Three-Dimensional  Waveguide  Discontinuities 

Jin-Fa Lee
ECE Dept., WPI
Worcester, MA 01609


Abstract 

In this paper, we present an unconditionally-stable finite element time domain (FETD) method
using edge elements for modeling three-dimensional waveguide discontinuity problems. Unlike
the FDTD methods, the current approach can be used in conjunction with unstructured meshes,
particularly tetrahedral finite elements. Furthermore, since it is unconditionally stable, it offers
great advantages in handling EM problems with element sizes in the discretization varying several
orders of magnitude across the problem domain. The time step is determined by the accuracy
rather than by the stability as in the conditionally stable algorithms. Moreover, the computational
complexity of the proposed FETD algorithm is analyzed and confirmed by numerical
experiments.


1  Introduction 

The finite difference time domain (FDTD) algorithm has been used widely in solving transient
responses of electromagnetic problems. However, using FDTD in its original form, it is difficult
to model complex EM problems with curved surfaces. Many variants have been proposed in the
past with the aim of circumventing this difficulty, with varying degrees of success [1]. Almost all of
these approaches are based upon, in one form or another, the use of finite difference approximation
in both spatial and temporal domains. It is the purpose of this paper to formulate a time domain
method using Whitney elements for solving Maxwell's equations. In this way, the proposed FETD
method can be used on a tetrahedral finite element mesh and, consequently, it imposes no geometric
limitations.

The approach proposed herein is an unconditionally stable finite element time domain method,
using only the edge elements or Whitney 1-forms in the spatial domain. Since it is unconditionally
stable, it offers great advantages in handling EM problems with element sizes in the discretization
varying several orders of magnitude across the problem domain. In those situations, the time steps
are no longer limited by the smallest element size as in the conditionally stable algorithms.
Consequently, a much bigger time step could be employed in the numerical simulation. However,
the price paid is the need for matrix inversion at each time step, and consequently, the
computational complexity, in the worst case, could be proportional to $N^{3/2}$, where $N$ is the number
of unknowns.

The rest of this paper is organized as follows. Section 2 presents the formulation of the
unconditionally stable FETD algorithm; its computational complexity is analyzed in section 3.
Numerical results of two examples, a T junction and a rectangular waveguide to coaxial cable
converter, are shown in section 4.


2  Formulation 


In this paper, we consider the solution of Maxwell's equations in space-time

\[
\nabla \times \vec{E} = -\mu \frac{\partial \vec{H}}{\partial t}, \qquad
\nabla \times \vec{H} = \vec{J} + \epsilon \frac{\partial \vec{E}}{\partial t}. \tag{1}
\]

We shall derive an implicit finite element time domain (FETD) approach which is unconditionally
stable to solve (1) based on the use of edge elements.

Faedo-Galerkin  Process 

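For the derivation of the implicit FETD algorithm, we start with an initial value problem
(IVP) in terms of the electric field $\vec{E}$ as

\[
\begin{aligned}
\nabla \times \frac{1}{\mu_r} \nabla \times \vec{E} + \frac{\epsilon_r}{c^2} \frac{\partial^2 \vec{E}}{\partial t^2}
  &= -\mu_0 \frac{\partial \vec{J}}{\partial t} && \text{in } \Omega, \\
\hat{n} \times \vec{E} &= 0 && \text{on } \Gamma_e, \\
\hat{n} \times \nabla \times \vec{E} &= 0 && \text{on } \Gamma_h, \\
\hat{n} \times \nabla \times \vec{E} &= -\frac{1}{c} \frac{\partial}{\partial t} \left( \hat{n} \times \hat{n} \times \vec{E} \right) && \text{on } \Gamma_\infty,
\end{aligned} \tag{2}
\]

where $\Gamma_e$, $\Gamma_h$ are electric and magnetic walls, respectively, and $\Gamma_\infty$ is the truncation boundary.
For simplicity, we have adopted the first-order absorbing boundary condition (ABC) on $\Gamma_\infty$ in
order to truncate the infinite domain into a finite region.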

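To simplify the derivation, let us assume that $\vec{J} = 0$, although the inclusion of $\vec{J}$ in the
derivation is rather straightforward. The weak or the Galerkin form of the IVP (2) can be stated as

\[
\int_\Omega \frac{1}{\mu_r} \left( \nabla \times \vec{v} \right) \cdot \left( \nabla \times \vec{E} \right) d\Omega
+ \frac{1}{c} \oint_{\Gamma_\infty} \left( \hat{n} \times \vec{v} \right) \cdot \left( \hat{n} \times \frac{\partial \vec{E}}{\partial t} \right) d\Gamma
+ \frac{1}{c^2} \int_\Omega \epsilon_r \, \vec{v} \cdot \frac{\partial^2 \vec{E}}{\partial t^2} \, d\Omega = 0 \tag{3}
\]

for any test vector function $\vec{v}$. By choosing edge elements to span both the trial and test function
spaces, the application of the Faedo-Galerkin process results in a system of ordinary differential
equations (ODEs)

\[
[T] \frac{d^2 e}{dt^2} + c \, [R] \frac{d e}{dt} + c^2 [S] \, e = 0, \tag{4}
\]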
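where $e$ is the coefficient vector, and

\[
[T]_{ij} = \int_\Omega \epsilon_r \, \vec{W}_i \cdot \vec{W}_j \, d\Omega, \qquad
[R]_{ij} = \oint_{\Gamma_\infty} \left( \hat{n} \times \vec{W}_i \right) \cdot \left( \hat{n} \times \vec{W}_j \right) d\Gamma, \qquad
[S]_{ij} = \int_\Omega \frac{1}{\mu_r} \left( \nabla \times \vec{W}_i \right) \cdot \left( \nabla \times \vec{W}_j \right) d\Omega. \tag{5}
\]

The general three-point recurrence schemes for second order equations have been documented in
[2]. Applying them to the ODEs (4), the result is

\[
\begin{aligned}
&\left\{ [T] + \gamma \, c \Delta t \, [R] + \beta \, c^2 \Delta t^2 [S] \right\} e^{n+1} \\
&\quad + \left\{ -2[T] + (1 - 2\gamma) \, c \Delta t \, [R] + \left( \tfrac{1}{2} - 2\beta + \gamma \right) c^2 \Delta t^2 [S] \right\} e^{n} \\
&\quad + \left\{ [T] - (1 - \gamma) \, c \Delta t \, [R] + \left( \tfrac{1}{2} + \beta - \gamma \right) c^2 \Delta t^2 [S] \right\} e^{n-1} = 0.
\end{aligned} \tag{6}
\]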

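In the present approach, we have chosen $\gamma = \frac{1}{2}$ and $\beta = \frac{1}{4}$, resulting in the
difference equation (7), where

\[
\begin{aligned}
[P] &= 4 c \Delta t \, [R], \\
[Q] &= 4 c \Delta t \, [R] + 4 c^2 \Delta t^2 [S], \\
[M] &= 4 [T] + 2 c \Delta t \, [R] + c^2 \Delta t^2 [S].
\end{aligned} \tag{8}
\]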
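
As a check on the matrices [P], [Q], and [M], the standard Newmark-type three-point recurrence for a second-order system can be specialized by hand. The derivation below is a sketch: it assumes the ODE system has the form $[T]\ddot{e} + c[R]\dot{e} + c^2[S]e = 0$, and the final arrangement is one form consistent with those matrices, not necessarily the exact arrangement printed as the difference equation (7).

```latex
% Three-point recurrence with gamma = 1/2, beta = 1/4, multiplied through by 4:
\begin{align*}
\left\{ 4[T] + 2c\Delta t[R] + c^2\Delta t^2[S] \right\} e^{n+1}
  + \left\{ -8[T] + 2c^2\Delta t^2[S] \right\} e^{n}
  + \left\{ 4[T] - 2c\Delta t[R] + c^2\Delta t^2[S] \right\} e^{n-1} &= 0 \\
% Identify 8[T] - 2c^2 dt^2 [S] = 2[M] - [Q] and 4[T] - 2c dt [R] + c^2 dt^2 [S] = [M] - [P]:
[M]\, e^{n+1} &= \left( 2[M] - [Q] \right) e^{n} - \left( [M] - [P] \right) e^{n-1}
\end{align*}
```

In this form each time step costs one solve with the fixed matrix [M], which is the linear system analyzed in the complexity discussion.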

The proof of the stability condition using the Z-transform technique can be found in Ref. [3].
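
The practical meaning of unconditional stability can be seen on a scalar model problem. The sketch below marches $\ddot{e} + \omega^2 e = 0$ with the three-point recurrence; it is an illustration under stated assumptions (a single unknown, no damping term, hypothetical function name), not the FETD code itself. With $\gamma = 1/2$, $\beta = 1/4$ the samples stay bounded even for $\omega \Delta t = 100$, whereas the explicit choice $\beta = 0$ diverges once $\omega \Delta t > 2$.

```python
def three_point_march(beta, gamma, omega_dt, steps, e0=1.0, e1=1.0):
    """March e'' + omega^2 e = 0 with the three-point recurrence; return all samples."""
    x2 = omega_dt ** 2
    a_new = 1.0 + beta * x2                           # multiplies e^{n+1}
    a_mid = -2.0 + (0.5 - 2.0 * beta + gamma) * x2    # multiplies e^{n}
    a_old = 1.0 + (0.5 + beta - gamma) * x2           # multiplies e^{n-1}
    history = [e0, e1]
    for _ in range(steps):
        history.append(-(a_mid * history[-1] + a_old * history[-2]) / a_new)
    return history


implicit = three_point_march(0.25, 0.5, omega_dt=100.0, steps=200)
explicit = three_point_march(0.0, 0.5, omega_dt=3.0, steps=200)
```

Boundedness does not imply accuracy: at such large values of $\omega \Delta t$ the implicit scheme is stable but resolves the oscillation poorly, which is why the time step is chosen from accuracy considerations.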

3  Computational  Complexity 

In this section, we shall briefly discuss the computational complexity of the time-marching scheme
proposed in (7). Assuming the physical time is fixed, the total CPU time can be expressed as

\[
T_{cpu} = (\#\text{time steps}) \times T_{step}, \tag{9}
\]

where $T_{step}$ is the average CPU time to update the electric field from time $n\Delta t$ to $(n+1)\Delta t$. It
is clear from (7) that for each time step, a matrix equation of the form

\[
[M] \, x = y \tag{10}
\]

needs to be solved. Fortunately, $[M]$ is positive definite and (10) can be solved efficiently using
the preconditioned conjugate gradient (PCCG) method. Subsequently, we write

\[
T_{step} \propto N_{CG} \times N, \tag{11}
\]

where $N$ is the dimension of the matrix $[M]$. Equation (11) is obtained simply because each
CG iteration involves nothing but matrix and vector multiplications. Moreover, it can be shown
that with the diagonal preconditioner [4], we have

\[
\kappa\left( [D]^{-1} [M] \right) \propto O\!\left( \frac{1}{h} \right), \tag{12}
\]

where $\kappa([A])$ is the condition number of matrix $[A]$. In Eq. (12), $[D]$ is the diagonal portion of
$[M]$ and $h$ is the element size.

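Furthermore, in the application of the CG method, it has been shown that

\[
\frac{\| x - x_n \|}{\| x - x_0 \|} \le 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{n}, \tag{13}
\]

where $\| x - x_n \|$ is the error for the $n$th iteration, and $\| x - x_0 \|$ the error for the initial guess
$x_0$. Consequently, the number of iterations $N_{CG}$ for PCCG to converge can be roughly estimated as

\[
N_{CG} \approx \frac{\sqrt{\kappa}}{2} \ln\left( \frac{2}{\delta} \right), \tag{14}
\]

where $\delta$ is the tolerance set for convergence. Therefore, we have

\[
N_{CG} \propto \frac{1}{\sqrt{h}}. \tag{15}
\]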
Putting everything together and assuming that we choose $\Delta t \propto h$ results in

\[
T_{cpu} \propto (\#\text{time steps}) \times N_{CG} \times N
\propto \frac{1}{h} \times \frac{1}{\sqrt{h}} \times \frac{1}{h^3} \propto N^{3/2}. \tag{16}
\]

Compared to the FDTD, whose computational complexity is $N^{4/3}$, the current FETD is slightly
less efficient. However, as mentioned earlier, the unconditionally stable nature of the current
formulation makes it extremely appealing for modeling problems with very small features.
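
The iteration-count estimate used in this argument follows from the classical conjugate gradient error bound (stated in the energy norm). A minimal sketch, with hypothetical function names and an arbitrary example condition number, shows that the estimated count is sufficient to drive the bound below the requested tolerance.

```python
import math


def cg_error_bound(kappa, n):
    """Classical upper bound on the relative CG error after n iterations."""
    root = math.sqrt(kappa)
    return 2.0 * ((root - 1.0) / (root + 1.0)) ** n


def cg_iterations_estimate(kappa, tol):
    """Rough iteration count: (sqrt(kappa) / 2) * ln(2 / tol), rounded up."""
    return math.ceil(0.5 * math.sqrt(kappa) * math.log(2.0 / tol))


# Example values chosen for illustration only.
iterations = cg_iterations_estimate(kappa=100.0, tol=1e-6)
```

Since the count grows only with the square root of the condition number, a modest growth of the condition number under mesh refinement adds a correspondingly small factor to the cost per time step.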

Figure 1: Finite element mesh for a T-junction.

4 Numerical Results

Two waveguide discontinuities are analyzed by using the current FETD approach. One is a T
junction and the other is a rectangular waveguide to coaxial cable converter. To perform the
FETD analysis, the tetrahedral meshes for both cases are generated using an automatic mesh
generation program.

T  Junction 

Shown in Fig. 1 is a sample finite element mesh used for the FETD program to analyze a
three-port T junction. Each of the ports is a rectangular waveguide with dimensions of 1 cm x 2 cm.
Several meshes were created and used for the analyses to characterize the performance of the
FETD formulation. The characteristics of the meshes, the corresponding time steps used in the
algorithm, and the total CPU times are shown in Table I. Figure 2 plots the CPU times versus
the total number of unknowns. It is found that the computational complexity is approximately
$T_{cpu} \sim N^{\alpha}$, where $4/3 < \alpha < 3/2$, which is consistent with Eq. (16). Furthermore, the
steady-state field distribution of this T junction is shown in Fig. 3.


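Table I: Statistics of FETD algorithm for a T junction. (Column headings as printed: N, cδt, T_cpu; one of the
two middle columns is cδt and the heading of the other is missing.)

N       (col. 2)   (col. 3)   T_cpu
1191    0.1199     0.52916    116.06
1685    0.1199     0.17381    216.53
1883    0.0816     0.45506    219.36
3610    0.0658     0.37289    619.0
6309    0.0591     0.31190    1125.12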
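
The exponent in the observed scaling can be estimated directly from the tabulated unknown counts and CPU times. The sketch below is an illustration added here, not part of the original analysis: it fits a straight line to log(T_cpu) against log(N) by ordinary least squares, using only the first and last columns of Table I.

```python
import math


def fit_power_law_exponent(sizes, times):
    """Least-squares slope of log(times) against log(sizes)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# First and last columns of Table I (number of unknowns, total CPU time).
unknowns = [1191, 1685, 1883, 3610, 6309]
cpu_times = [116.06, 216.53, 219.36, 619.0, 1125.12]
alpha = fit_power_law_exponent(unknowns, cpu_times)
```

For these five points the fitted slope comes out at roughly 1.37, inside the quoted range between 4/3 and 3/2.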

Figure 4: The geometry of a rectangular waveguide to coaxial converter.

Waveguide  to  Coaxial  Converter 

Another  example  that  we  have  analyzed  is  a  rectangular  waveguide  to  coaxial  cable  converter 
as  shown  in  Fig.  4.  By  using  the  finite  elements,  particularly  the  tetrahedral  elements,  the 
discretization  can  be  made  conforming  to  the  problem  geometry  conveniently.  Once  again,  the 
whole  process  of  discretization  is  done  automatically  by  using  an  automatic  tetrahedral  mesher. 
The final finite element mesh which is used in the analysis is shown in Fig. 5. In this mesh,
there are a total of 6524 elements, and the smallest element length is 0.1. Since the formulation is
unconditionally stable, the time step used in the calculation is based upon the average element
length, which corresponds to cδt = 2.6. This is a much bigger time step than would be possible using
the conditionally stable algorithms. Finally, the steady state field distribution at a frequency of
10 GHz is shown in Fig. 6.

An FDTD/FVTD 2D-algorithm to Solve Maxwell's Equations

Jei  S.  Chen,  John  V.  Prodan,  and  Kane  S.  Yee 
Lockheed  Palo  Alto  Research  Laboratories 
Palo  Alto,  California 

Abstract: A recently devised finite difference and finite volume time domain hybrid scheme [1] has simplified
the process for calculating the electromagnetic scattering for a large class of 2-D, exterior volume, scattering
problems. The computational grid is conformal to the object. The class of problems we address in this paper is
the scattering of an incoming plane Gaussian pulse by various 2-D objects, either in free space or above a ground
plane. It is the purpose of this paper to review the hybrid finite difference time domain (FDTD) and the finite
volume time domain (FVTD) algorithms in a 2-D setting and to show how relatively easy it is to generate the
conformal grid to which we can apply our time domain algorithm. In addition, examples of the kinds of
calculations that can be done with this code for this class of objects will be presented.

Introduction

The technique of conformal time domain calculation in electromagnetics is not very old. It started with
a paper by R. Holland [2] on the time domain discretization of Maxwell's equations in general non-orthogonal
coordinates. The application of the surface-curve integral form of Maxwell's equations to derive difference
equations [3, 4, 5], instead of the FDTD in rectangular form, gives the freedom of numerically solving Maxwell's
equations on a grid without an explicit coordinate system. The use of the integral forms (surface-curve and
volume-surface) of Maxwell's equations was exploited by Madsen and Ziolkowski [6]. The time domain
conformal techniques are now more developed [6, 7, 8]; however, 3-D conformal calculations are still not widely
used. (In fact, most of the conformal calculations are in 2-D [5, 9, 10].) Recently, Holland [11] gave a very
detailed discussion of the 2-D conformal calculation based mainly on the surface-curve integral form of
Maxwell's equations. A glance at the content of that paper shows that even the 2-D conformal calculations are
far from trivial. This complication of the 2-D conformal calculation, however, seems to be substantially eased
with the newly advanced FDTD/FVTD hybrid [1]. In fact, since the announcement of the FDTD/FVTD at the
1993 PIERS conference [12], the 2-D FDTD/FVTD has been found to be simple to implement and to yield good
results for some applications [13, 14]. In this paper we specialize our 3-D FDTD/FVTD technique for 2-D
calculations. It will be shown, with examples, that it is now quite easy to do conformal 2-D time domain
calculations. The calculation is further simplified with the substitution of the traditional radiation boundary
condition (RBC) by a newly discovered tapered damping technique near the outer computational boundary.

A brief discussion of the general relations on which the FDTD and FVTD algorithms are based is
given, as well as the general philosophy for creating the grid. The 2-D relationships needed for the TM and TE
cases will then be developed. (For the 2-D case, all the field variables are independent of the z-coordinate.) A
somewhat expanded description of the boundary conditions for the TE case will be presented since they are a
little more involved than those for the TM case. In addition, we have implemented a new radiation boundary
condition which will be briefly described. Computational results for three "representative" objects will be given
for the TM case. In 2-D problems, where computer memory is not a serious concern, one can easily generate
time and spatial displays of field values and energy densities as well as cross section data. Electromagnetic
energy density maps, at two different time steps, for an s-duct and a faceted object sitting above a ground plane
are presented along with the RCS calculations for a circular cylinder.

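The FDTD and the FVTD relations

There are four relationships that form the core algorithms for the FDTD and FVTD techniques. These
are the line-surface integral and surface-volume integral forms of both Ampere's and Faraday's laws, given by:

\[
-\frac{\partial}{\partial t} \int_{A} \vec{B} \cdot d\vec{a} = \oint_{\partial A} \vec{E} \cdot d\vec{l}
\qquad \text{Faraday's Law (line-surface)} \tag{1a}
\]

\[
\frac{\partial}{\partial t} \int_{A} \vec{D} \cdot d\vec{a} = \oint_{\partial A} \vec{H} \cdot d\vec{l}
\qquad \text{Ampere's Law (line-surface)} \tag{1b}
\]

\[
-\frac{\partial}{\partial t} \int_{V} \vec{B} \, dv = \oint_{\partial V} \hat{n} \times \vec{E} \, da
\qquad \text{Faraday's Law (surface-volume)} \tag{2a}
\]

\[
\frac{\partial}{\partial t} \int_{V} \vec{D} \, dv = \oint_{\partial V} \hat{n} \times \vec{H} \, da
\qquad \text{Ampere's Law (surface-volume)} \tag{2b}
\]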

We  refer  to  the  line  integral  in  equation  (la)  as  the  electric  circulation;  the  line  integral  in  equation  (lb)  as  the 
magnetic  circulation;  the  surface  integral  in  equation  (2a)  as  the  electric  vorticity;  and  the  surface  integral  in 
equation  (2b)  as  the  magnetic  vorticity.  The  (generalized)  FDTD  algorithm  is  developed  from  the  discretization 
of Equations 1, while the FVTD algorithm is based on the discretization of Equations 2. In specializing to 2
dimensions, we take the volumes in Eqs. 2 to be prisms with height equal to Δz, and the trace of a prism in the
x-y plane to be a curve.

Grid  construction 

A conformal grid is used to model the objects, eliminating the concerns that arise when a rectilinear
staircasing grid is used. While the grid is conformal, it is not necessarily orthogonal. Examples of some objects
are shown in Figures 1a-1d, together with the possible grids. Figure 1a depicts the grid for a PEC ogive; Fig. 1b
shows a grid for an s-duct; the grid for a PEC circular cylinder is shown in Fig. 1c; and Fig. 1d shows a grid for a
faceted PEC object sitting above a ground plane.

For the most part, the grids are formed by translating (either horizontally or vertically) sections of the
objects' boundary curve. In some cases, e.g., between the object and the ground plane in Fig. 1d, instead of a
pure translation, the boundary curve can be extended from the object to the ground plane (which is also the
computational boundary) by scaling the grid from the side of the object, so that the number of cells would be the
same. (Care needs to be taken to ensure that the Courant condition is still satisfied for these "reduced size"
cells.) The grid should be constructed such that the slope of the grid lines where the conformal part "transitions"
to the rectangular grid is not excessive (~45° or less is advised). The reason for this requirement is to try to
maintain accuracy in the updating of the electric field. The electric field vector is assigned at the vertices of the
cells, while the magnetic vector is assigned at the cell center. (Figure 1e shows the positions of the field nodes in
a blow-up of the upper right section of the grid shown in Fig. 1c.) To update the electric field, either the
magnetic circulation or vorticity is needed, depending on the polarization. If the transition slope is too large, the
centroid of the contour (formed by connecting the open circles in Fig. 1e) used to calculate the magnetic
circulation (needed to update the electric field in the TM case) will not coincide with an electric vertex. (For the
TE case, it is the centroid of the volume about which the magnetic vorticity is calculated that should be
coincident with the electric vertex.) If near coincidence can be maintained, then the numerical accuracy of the
update will be of 2nd order; otherwise only 1st order accuracy is achieved.

The  TM  case 

For the TM case, the 2-D fields consist of the following: the z-component of the electric field, $E_z$, and
the x- and y-components of the magnetic field, $H_x$ and $H_y$. All the field components are functions of x, y,
and t only.

The electric nodes are located at the grid points (i,j), with coordinates given by (vex(i,j), vey(i,j)). The
magnetic nodes are located at (i+1/2, j+1/2), with coordinates given by (vmx(i+1/2, j+1/2), vmy(i+1/2, j+1/2)). To
update the electric field at an electric node (i,j) (see Fig. 2a), the magnetic circulation is calculated and the FDTD
algorithm is used.

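The z-direction magnetic circulation, cirm(i,j), is:

$$
\begin{aligned}
\mathrm{cirm}(i,j) ={}& \vec{H}(j1)\cdot\big(\mathrm{vmx}(i+\tfrac12,j-\tfrac12)-\mathrm{vmx}(i-\tfrac12,j-\tfrac12),\ \mathrm{vmy}(i+\tfrac12,j-\tfrac12)-\mathrm{vmy}(i-\tfrac12,j-\tfrac12)\big)\\
{}+{}& \vec{H}(i2)\cdot\big(\mathrm{vmx}(i+\tfrac12,j+\tfrac12)-\mathrm{vmx}(i+\tfrac12,j-\tfrac12),\ \mathrm{vmy}(i+\tfrac12,j+\tfrac12)-\mathrm{vmy}(i+\tfrac12,j-\tfrac12)\big)\\
{}+{}& \vec{H}(j2)\cdot\big(\mathrm{vmx}(i-\tfrac12,j+\tfrac12)-\mathrm{vmx}(i+\tfrac12,j+\tfrac12),\ \mathrm{vmy}(i-\tfrac12,j+\tfrac12)-\mathrm{vmy}(i+\tfrac12,j+\tfrac12)\big)\\
{}+{}& \vec{H}(i1)\cdot\big(\mathrm{vmx}(i-\tfrac12,j-\tfrac12)-\mathrm{vmx}(i-\tfrac12,j+\tfrac12),\ \mathrm{vmy}(i-\tfrac12,j-\tfrac12)-\mathrm{vmy}(i-\tfrac12,j+\tfrac12)\big)
\end{aligned}
\tag{3}
$$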
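
As a minimal sketch of Eq. (3), the circulation can be accumulated edge by edge around the contour J1-I2-J2-I1. The function names below and the way a representative field vector is chosen on each edge (here, sampling at the edge midpoint) are assumptions made for illustration; the text above does not specify how $\vec{H}(j1)$, $\vec{H}(i2)$, etc. are obtained.

```python
def circulation(vertices, edge_fields):
    """Sum of H . dl around a closed polygon, as in Eq. (3).

    vertices    : (x, y) corners in counter-clockwise order; for Eq. (3) these
                  are the magnetic nodes (i-1/2,j-1/2), (i+1/2,j-1/2),
                  (i+1/2,j+1/2), (i-1/2,j+1/2).
    edge_fields : one (Hx, Hy) vector per edge (J1, I2, J2, I1); edge k runs
                  from vertices[k] to vertices[k+1].
    """
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        hx, hy = edge_fields[k]
        total += hx * (x1 - x0) + hy * (y1 - y0)
    return total


def midpoint_fields(vertices, field):
    """Hypothetical helper: sample a field function at each edge midpoint."""
    n = len(vertices)
    samples = []
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        samples.append(field(0.5 * (x0 + x1), 0.5 * (y0 + y1)))
    return samples


# A skewed (non-orthogonal) quadrilateral cell.
quad = [(0.0, 0.0), (1.0, 0.1), (1.2, 1.0), (-0.1, 0.9)]
uniform = midpoint_fields(quad, lambda x, y: (3.0, -2.0))   # curl-free field
rotating = midpoint_fields(quad, lambda x, y: (-y, x))      # curl = 2 everywhere
print(circulation(quad, uniform))    # vanishes up to round-off
print(circulation(quad, rotating))   # 2 * (enclosed area)
```

A uniform field gives zero circulation on any closed contour, and the linear field (-y, x) gives twice the enclosed area, which is a convenient sanity check on the ordering of the corner differences.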

To update the magnetic vector at the magnetic node (i+1/2, j+1/2), the electric vorticity is needed and the
FVTD algorithm is used. For the required volume, we take a prism of height $\Delta z$, whose cross-section is shown
in Fig. 2b. The relation between the unit outward (surface) normal $\hat{n}$, the unit tangent vector $\hat{t}$, and the unit
vector along the z-direction, $\hat{z}$, is given by $\hat{n} \times \hat{t} = \hat{z}$. These three vectors $\hat{n}$, $\hat{t}$ and $\hat{z}$ form a right-handed
orthogonal triad. The finite volume algorithm is applied to this prism in order to update the magnetic vector at
the magnetic node (i+1/2, j+1/2). The contributions from the top and the bottom surfaces cancel. Denoting the
length of a side by $\delta$, the area vector of a lateral surface is given by $\vec{A} = \hat{n}A = \hat{n}\,\delta\,\Delta z$. The electric vorticity
at the lateral surfaces is found by computing $\vec{A} \times \vec{E}$ for each of the sides as follows:

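$$
\begin{aligned}
\vec{A}(j1)\times\vec{E}(j1) &= A(j1)\,\hat{n}\times(E_z\hat{z}) = -\Delta z\,\delta\,E_z\,\hat{t} = -\Delta z\,E_z\,(\delta\,\hat{t})\\
&= -\Delta z\,E_z(j1)\,\big\{\hat{x}\,[\mathrm{vex}(i+1,j)-\mathrm{vex}(i,j)] + \hat{y}\,[\mathrm{vey}(i+1,j)-\mathrm{vey}(i,j)]\big\}
\end{aligned}
\tag{4a}
$$

similarly,

$$
\vec{A}(i2)\times\vec{E}(i2) = -\Delta z\,E_z(i2)\,\big\{\hat{x}\,[\mathrm{vex}(i+1,j+1)-\mathrm{vex}(i+1,j)] + \hat{y}\,[\mathrm{vey}(i+1,j+1)-\mathrm{vey}(i+1,j)]\big\}
\tag{4b}
$$

$$
\vec{A}(j2)\times\vec{E}(j2) = -\Delta z\,E_z(j2)\,\big\{\hat{x}\,[\mathrm{vex}(i,j+1)-\mathrm{vex}(i+1,j+1)] + \hat{y}\,[\mathrm{vey}(i,j+1)-\mathrm{vey}(i+1,j+1)]\big\}
\tag{4c}
$$

$$
\vec{A}(i1)\times\vec{E}(i1) = -\Delta z\,E_z(i1)\,\big\{\hat{x}\,[\mathrm{vex}(i,j)-\mathrm{vex}(i,j+1)] + \hat{y}\,[\mathrm{vey}(i,j)-\mathrm{vey}(i,j+1)]\big\}
\tag{4d}
$$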
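
The four contributions (4a)-(4d) share one pattern: each side adds $-\Delta z\,E_z$ times its edge vector. The sketch below sums them for an arbitrary prism base. The function name and the choice of the $E_z$ value assigned to each side are assumptions for illustration (the example samples a linear field at the side midpoints).

```python
def electric_vorticity(vertices, ez_edges, dz):
    """Sum of A x E over the lateral faces of a prism, following Eqs. (4a)-(4d).

    vertices : (x, y) corners of the prism base in counter-clockwise order,
               i.e. electric nodes (i,j), (i+1,j), (i+1,j+1), (i,j+1).
    ez_edges : Ez value assigned to each side, in the order J1, I2, J2, I1.
    dz       : prism height.
    Returns the (x, y) components of the vorticity vector.
    """
    vx = vy = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        vx += -dz * ez_edges[k] * (x1 - x0)
        vy += -dz * ez_edges[k] * (y1 - y0)
    return vx, vy


# Unit square base with Ez = x sampled at the side midpoints.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
vort = electric_vorticity(square, [0.5, 1.0, 0.5, 0.0], dz=1.0)
print(vort)
```

For $E_z = x$ the curl of $\vec{E}$ is $(0, -1, 0)$, so on a unit prism the summed vorticity should be $(0, -1)$, which is what the edge sum returns.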

The electric vorticity, $\overrightarrow{\mathrm{vore}}^{\,n}(i+\tfrac12,j+\tfrac12)$, over the lateral faces is the sum of the above four expressions.

Once the electric vorticity is known, the magnetic vector can be updated by

$$
\vec{H}^{\,n+1/2}(i+\tfrac12,j+\tfrac12) = \vec{H}^{\,n-1/2}(i+\tfrac12,j+\tfrac12) - \frac{\Delta t}{\mu\,\Delta v}\,\overrightarrow{\mathrm{vore}}^{\,n}(i+\tfrac12,j+\tfrac12)
\tag{5a}
$$

Similarly, knowing the magnetic circulation, the z-component of the electric field can be updated by:

$$
E_z^{\,n+1}(i,j) = E_z^{\,n}(i,j) + \frac{\Delta t}{\epsilon\,\Delta a}\,\mathrm{cirm}^{\,n+1/2}(i,j)
\tag{5b}
$$

In the above two expressions, $\Delta v$ is the volume of the prism, i.e., the area shown in Fig. 2b multiplied by the
height $\Delta z$; $\Delta a$ is the area enclosed by the J1-I2-J2-I1 loop shown in Fig. 2a. The superscripts indicate the time
step index.
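
A compact sketch of the two TM updates follows, assuming the coefficients $\Delta t/(\mu\,\Delta v)$ and $\Delta t/(\epsilon\,\Delta a)$ that follow from Faraday's and Ampere's laws for a lossless medium. The helper names are hypothetical, and the enclosed area is computed with the shoelace formula, which is one convenient choice rather than something prescribed by the text.

```python
def polygon_area(vertices):
    """Shoelace formula; vertices given counter-clockwise."""
    twice_area = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return 0.5 * twice_area


def update_h(h_old, vorticity, dt, mu, base_area, dz):
    """Eq. (5a): H^{n+1/2} = H^{n-1/2} - dt/(mu*dv) * vore^n, with dv = base_area*dz."""
    dv = base_area * dz
    return tuple(h - dt / (mu * dv) * v for h, v in zip(h_old, vorticity))


def update_ez(ez_old, cirm, dt, eps, loop_area):
    """Eq. (5b): Ez^{n+1} = Ez^n + dt/(eps*da) * cirm^{n+1/2}."""
    return ez_old + dt / (eps * loop_area) * cirm


# Normalized units (mu = eps = 1), illustrative numbers only.
quad = [(0.0, 0.0), (1.0, 0.1), (1.2, 1.0), (-0.1, 0.9)]
area = polygon_area(quad)
h_new = update_h((0.0, 0.0), (0.0, -1.0), dt=0.1, mu=1.0, base_area=1.0, dz=1.0)
ez_new = update_ez(0.0, cirm=2.06, dt=0.1, eps=1.0, loop_area=area)
print(area, h_new, ez_new)
```

Note that $\Delta z$ cancels between the vorticity (proportional to $\Delta z$) and the prism volume, so the 2-D update does not depend on the prism height chosen.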

The  TE  case 

For the TE case, the 2-D fields consist of the following: the z-component of the magnetic field, $H_z$, and
the x- and y-components of the electric field, $E_x$ and $E_y$. As before, these field components are functions of
x, y, and t only.

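To update the magnetic field at a magnetic node (i+1/2, j+1/2), the electric circulation is computed and
the FDTD algorithm is used. Referring to Fig. 3a, the z-direction electric circulation, cire(i+1/2, j+1/2), is:

$$
\begin{aligned}
\mathrm{cire}(i+\tfrac12,j+\tfrac12) ={}& \vec{E}(j1)\cdot\big(\mathrm{vex}(i+1,j)-\mathrm{vex}(i,j),\ \mathrm{vey}(i+1,j)-\mathrm{vey}(i,j)\big)\\
{}+{}& \vec{E}(i2)\cdot\big(\mathrm{vex}(i+1,j+1)-\mathrm{vex}(i+1,j),\ \mathrm{vey}(i+1,j+1)-\mathrm{vey}(i+1,j)\big)\\
{}+{}& \vec{E}(j2)\cdot\big(\mathrm{vex}(i,j+1)-\mathrm{vex}(i+1,j+1),\ \mathrm{vey}(i,j+1)-\mathrm{vey}(i+1,j+1)\big)\\
{}+{}& \vec{E}(i1)\cdot\big(\mathrm{vex}(i,j)-\mathrm{vex}(i,j+1),\ \mathrm{vey}(i,j)-\mathrm{vey}(i,j+1)\big)
\end{aligned}
\tag{6}
$$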

To update the electric vector at the (interior) electric node (i,j), FVTD is used and the magnetic vorticity
is required. For the relevant volume, we take a prism of height $\Delta z$, whose cross-section is shown in Fig. 3b. As
before, the unit outward normal $\hat{n}$, the unit tangent vector $\hat{t}$, and the unit vector along the z-direction, $\hat{z}$, form
a right-handed orthogonal triad and are related by $\hat{n} \times \hat{t} = \hat{z}$. The finite volume algorithm is applied to this
prism in order to provide the update of the electric vector at the electric node (i,j). The contributions from the top
and the bottom surfaces cancel. Again, denoting the length of a side by $\delta$, the area vector of a lateral surface is
given by $\vec{A} = \hat{n}A = \hat{n}\,\delta\,\Delta z$. The magnetic vorticity at the lateral surfaces is found by evaluating $\vec{A} \times \vec{H}$ for
each of the sides as follows:

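$$
\begin{aligned}
\vec{A}(j1)\times\vec{H}(j1) &= A(j1)\,\hat{n}\times H_z(j1)\,\hat{z} = -\Delta z\,\delta\,H_z(j1)\,\hat{t} = -\Delta z\,H_z(j1)\,(\delta\,\hat{t})\\
&= -\Delta z\,H_z(j1)\,\big\{\hat{x}\,[\mathrm{vmx}(i+\tfrac12,j-\tfrac12)-\mathrm{vmx}(i-\tfrac12,j-\tfrac12)] + \hat{y}\,[\mathrm{vmy}(i+\tfrac12,j-\tfrac12)-\mathrm{vmy}(i-\tfrac12,j-\tfrac12)]\big\};
\end{aligned}
\tag{7a}
$$

similarly,

$$
\vec{A}(i2)\times\vec{H}(i2) = -\Delta z\,H_z(i2)\,\big\{\hat{x}\,[\mathrm{vmx}(i+\tfrac12,j+\tfrac12)-\mathrm{vmx}(i+\tfrac12,j-\tfrac12)] + \hat{y}\,[\mathrm{vmy}(i+\tfrac12,j+\tfrac12)-\mathrm{vmy}(i+\tfrac12,j-\tfrac12)]\big\},
\tag{7b}
$$

$$
\vec{A}(j2)\times\vec{H}(j2) = -\Delta z\,H_z(j2)\,\big\{\hat{x}\,[\mathrm{vmx}(i-\tfrac12,j+\tfrac12)-\mathrm{vmx}(i+\tfrac12,j+\tfrac12)] + \hat{y}\,[\mathrm{vmy}(i-\tfrac12,j+\tfrac12)-\mathrm{vmy}(i+\tfrac12,j+\tfrac12)]\big\},
\tag{7c}
$$

$$
\vec{A}(i1)\times\vec{H}(i1) = -\Delta z\,H_z(i1)\,\big\{\hat{x}\,[\mathrm{vmx}(i-\tfrac12,j-\tfrac12)-\mathrm{vmx}(i-\tfrac12,j+\tfrac12)] + \hat{y}\,[\mathrm{vmy}(i-\tfrac12,j-\tfrac12)-\mathrm{vmy}(i-\tfrac12,j+\tfrac12)]\big\}.
\tag{7d}
$$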

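The magnetic vorticity, $\overrightarrow{\mathrm{vorm}}^{\,n+1/2}(i,j)$, over the lateral faces is the sum of the above four expressions.

Once the electric circulation is known, we can update the z-component of the magnetic field using

$$
H_z^{\,n+1/2}(i+\tfrac12,j+\tfrac12) = H_z^{\,n-1/2}(i+\tfrac12,j+\tfrac12) - \frac{\Delta t}{\mu\,\Delta a}\,\mathrm{cire}^{\,n}(i+\tfrac12,j+\tfrac12)
\tag{8a}
$$

and knowing the magnetic vorticity, the electric vector can be updated by

$$
\vec{E}^{\,n+1}(i,j) = \vec{E}^{\,n}(i,j) + \frac{\Delta t}{\epsilon\,\Delta v}\,\overrightarrow{\mathrm{vorm}}^{\,n+1/2}(i,j).
\tag{8b}
$$

In the above two equations, $\Delta a$ is the area enclosed by the contour J1-I2-J2-I1 shown in Fig. 3a and $\Delta v$ is the
volume of the prism of height $\Delta z$ and base shown in Fig. 3b.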

PEC  boundary  condition  simulation  for  the  TE  wave 

The boundary condition simulation for the TE wave is somewhat more complicated than the case for the
TM wave, since electric nodes are located at the boundary of the scatterer. The magnetic field at the point P (see
Fig. 4) is updated with the FDTD algorithm. At the points 1 and 4, only the tangential components of the
electric field are known (through the boundary condition). In order to get the electric circulation along the edges
2→1 and 4→3, the component of the electric field along those edges, at the midpoints, is needed. (In the
following, the electric fields are evaluated at the time step $n\Delta t$.)

We obtain the component along each of these edges in the following manner. Let $\hat{t}$ now be the unit
tangent along 2→1, $\hat{N}$ be the unit normal at the point 1 and $\hat{T}$ the unit tangent along the surface at point 1. From
the boundary condition, the tangential electric field, $\hat{T}\cdot\vec{E}(1)$, is known. The component of the field along $\hat{n}_s$,
the unit normal to the line PQ, is given by $\hat{n}_s\cdot\vec{E}(a)$ and can be obtained by means of the FDTD algorithm.
Approximating the electric vector at a by the average of the fields at points 1 and 2, we have

$$
\vec{E}(a)\cdot\hat{T} = \tfrac12\,[\vec{E}(1)+\vec{E}(2)]\cdot\hat{T}.
\tag{9a}
$$

Breaking $\vec{E}(a)$ into components tangent and normal to the surface results in

$$
\vec{E}(a) = \hat{T}\,[\vec{E}(a)\cdot\hat{T}] + \hat{N}\,[\hat{N}\cdot\vec{E}(a)],
\tag{9b}
$$

and then forming the scalar product of (9b) and $\hat{n}_s$, we have

$$
\hat{n}_s\cdot\vec{E}(a) = [\hat{n}_s\cdot\hat{T}]\,[\vec{E}(a)\cdot\hat{T}] + [\hat{n}_s\cdot\hat{N}]\,[\hat{N}\cdot\vec{E}(a)].
\tag{9c}
$$

Solving for the component of the field along the normal, this final expression can be rewritten as

$$
\hat{N}\cdot\vec{E}(a) = \frac{1}{\hat{n}_s\cdot\hat{N}}\,\big\{\hat{n}_s\cdot\vec{E}(a) - [\hat{n}_s\cdot\hat{T}]\,[\vec{E}(a)\cdot\hat{T}]\big\}.
\tag{9d}
$$

Equation (9d) is the relation that is needed to find the component of $\vec{E}$ along 2→1 (i.e., in the $\hat{t}$ direction).
Dotting $\hat{t}$ into Eq. (9b) results in

$$
\hat{t}\cdot\vec{E}(a) = [\hat{t}\cdot\hat{T}]\,[\vec{E}(a)\cdot\hat{T}] + [\hat{t}\cdot\hat{N}]\,[\hat{N}\cdot\vec{E}(a)]
\tag{10}
$$

where the right hand side can now be evaluated. A similar procedure is used to generate the component of the
field along 4→3.
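
The projection in Eqs. (9d) and (10) can be checked numerically. In the sketch below the function name and the sample vectors are illustrative assumptions; the two inputs `e_T` and `e_ns` stand for the quantities that the text says are available, namely $\vec{E}(a)\cdot\hat{T}$ from (9a) and $\hat{n}_s\cdot\vec{E}(a)$ from the field update.

```python
import math


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def edge_component(t, T, N, n_s, e_T, e_ns):
    """Component of E(a) along the edge tangent t.

    t, T, N, n_s : 2-D unit vectors (edge tangent, surface tangent, surface
                   normal, normal to the line PQ).
    e_T          : T . E(a), from Eq. (9a).
    e_ns         : n_s . E(a), supplied by the field update.
    Requires n_s . N != 0.
    """
    e_N = (e_ns - dot(n_s, T) * e_T) / dot(n_s, N)   # Eq. (9d)
    return dot(t, T) * e_T + dot(t, N) * e_N          # Eq. (10)


# Consistency check with a known field vector (illustrative values).
angle = math.radians(30.0)
T = (math.cos(angle), math.sin(angle))     # surface tangent
N = (-math.sin(angle), math.cos(angle))    # surface normal
n_s = (0.0, 1.0)
t = (1.0, 0.0)
E = (0.7, -0.4)
recovered = edge_component(t, T, N, n_s, dot(E, T), dot(E, n_s))
print(recovered, dot(E, t))   # the two values should agree
```

Because $\hat{T}$ and $\hat{N}$ are orthonormal, the reconstruction is exact whenever $\hat{n}_s\cdot\hat{N}\neq 0$; the formula becomes ill-conditioned as $\hat{n}_s$ approaches the surface tangent.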

Outer  boundary  condition  simulation 

Until quite recently, all the radiation boundary simulations near the outer boundary of the computational
volume were based on the outgoing behavior of the scattered field. The most popular radiation boundary
condition (RBC) approximations in electromagnetic calculations are the Mur RBC [15] and the Liao RBC [16].
We have discovered, however, a rather robust approximation near the outer boundary that simulates artificial
damping of the scattered field in the neighborhood of this boundary. We call this technique "tapered damping";
the mathematical details describing this method are given in [17].

The procedure needed to implement this technique is very simple. The outer boundary of the
computational volume in our problem is roughly one to two wavelengths away from the smallest box containing
the scatterer, approximately the same distance away as when the Mur RBC is used. Several layers next to the
outer boundary are introduced (see Figs. 5a and 5b) and labeled zones 1 through N. Extending across these
zones, we define a tapered function:

$$
\mathrm{Tab}(ibd) = \cos\!\left(\frac{\pi}{3}\,\frac{N+1-ibd}{N}\right),
\tag{11}
$$

where ibd = 1, 2, ..., N is the zone number from the outer boundary. In addition, a function tap(i,j), defined over
the entire computational space, is introduced such that:

tap(i,j) = 1 if the point (i,j) does not belong to any of the zones
tap(i,j) = Tab(ibd) if the point (i,j) belongs to the zone ibd.

The function tap(i,j) takes on nonnegative values less than or equal to 1. (The above particular choice of Tab is
not meant to imply that this is the only one that should be used. Many such functions could be constructed (such
that their values taper smoothly from 1 to a value less than 1) that would probably serve just as well. Extensive
investigation has yet to be done to see if an "optimum" function exists.)
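
To make the definition concrete, the sketch below evaluates Eq. (11) and fills tap(i,j) for rectangular zones of the kind shown in Fig. 5a. It assumes that zone ibd is the ibd-th layer of grid points counted inward from the outer boundary (so the boundary points themselves fall in zone 1), and the function names `tab` and `tap_map` are illustrative.

```python
import math


def tab(ibd, n_zones):
    """Eq. (11): taper value for zone ibd (1 = outermost, n_zones = innermost)."""
    return math.cos((math.pi / 3.0) * (n_zones + 1 - ibd) / n_zones)


def tap_map(nx, ny, n_zones):
    """tap(i,j) on an nx-by-ny grid of points with rectangular damping zones."""
    tap = [[1.0] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            # Layer count from the nearest outer edge (assumed zone numbering).
            ibd = min(i, j, nx - 1 - i, ny - 1 - j) + 1
            if ibd <= n_zones:
                tap[i][j] = tab(ibd, n_zones)
    return tap


tap = tap_map(101, 101, 10)
print([round(tab(ibd, 10), 4) for ibd in range(1, 11)])
```

With N = 10 the taper runs from cos(π/3) = 0.5 in the outermost zone up to cos(π/30), just below 1, in the innermost zone, and tap is exactly 1 everywhere else.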

The  calculational  procedure  is  as  follows: 

1.  Set  the  values  of  the  scattered  electric  vector  located  at  the  outer  boundary  to 
be  zero  and  never  update  them. 

2a. Update the scattered magnetic field variables by the appropriate algorithm
(i.e., either FDTD or FVTD) for a lossless medium.

2b. Multiply each magnetic field variable by the value of a magnetic tapered
function at that point.

3a.  Update  the  electric  field  variable  by  the  appropriate  algorithm  for  a  lossless 
medium. 

3b. Multiply each electric variable by the value of an electric tapered function at
that point.

In the limited testing that we have done using this damping technique, we have found it to be reasonably robust
(the results have been relatively insensitive to the particular choice of taper function) and certainly much simpler
to implement than the Mur RBC. The best results seem to be obtained if the magnetic and electric tapered
functions are the same. In addition, this method seems to be more amenable to efficient implementation on some
of the massively parallel computing architectures than is the Mur RBC. (This is due to the fact that the
application of the technique is basically just a multiplication of the array holding the electric or magnetic field
variables by another array, a process that is very efficient on parallel machines. No special treatment needs to be
given to the values near the outer boundary.)
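
The calculational procedure (steps 1 to 3b) can be sketched on a 1-D toy problem. This is not the authors' 2-D code: it is a minimal illustration in normalized units (c = 1, Δx = 1, Courant number 0.5), with a standard leapfrog update standing in for "the appropriate algorithm", the same taper used for both fields, and zone 1 assumed to be the layer touching each end of the line.

```python
import math


def tab(ibd, n_zones):
    """Eq. (11) taper value for zone ibd."""
    return math.cos((math.pi / 3.0) * (n_zones + 1 - ibd) / n_zones)


def taper_line(n_points, n_zones, damped):
    """tap values along a line of nodes; 1.0 outside the damping zones."""
    values = []
    for idx in range(n_points):
        ibd = min(idx, n_points - 1 - idx) + 1
        inside = damped and ibd <= n_zones
        values.append(tab(ibd, n_zones) if inside else 1.0)
    return values


def run(n_cells=200, n_zones=10, n_steps=500, damped=True):
    """Return max |Ez| after n_steps for a Gaussian pulse started mid-line."""
    courant = 0.5
    ez = [math.exp(-((i - n_cells // 2) / 6.0) ** 2) for i in range(n_cells + 1)]
    hy = [0.0] * n_cells                 # hy[i] sits between ez[i] and ez[i+1]
    tap_e = taper_line(n_cells + 1, n_zones, damped)
    tap_h = taper_line(n_cells, n_zones, damped)
    ez[0] = ez[-1] = 0.0                 # step 1: boundary E fixed at zero
    for _ in range(n_steps):
        for i in range(n_cells):         # step 2a: lossless H update
            hy[i] += courant * (ez[i + 1] - ez[i])
        hy = [h * w for h, w in zip(hy, tap_h)]      # step 2b
        for i in range(1, n_cells):      # step 3a: lossless E update (interior)
            ez[i] += courant * (hy[i] - hy[i - 1])
        ez = [e * w for e, w in zip(ez, tap_e)]      # step 3b
    return max(abs(e) for e in ez)


with_damping = run(damped=True)
without_damping = run(damped=False)
print(with_damping, without_damping)
```

Without the taper, the two half-amplitude pulses simply bounce between the zero-field ends; with it, the field left on the line after both pulses have visited the damping zones should be far smaller. Steps 2b and 3b are plain element-wise multiplications, which is the property the text highlights for parallel machines.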

Computational  results 

As an illustration of our new FDTD/FVTD hybrid, we present calculations (TM case) for three different
objects: an s-duct, a PEC circular cylinder and a faceted target sitting over a ground plane. (See Figs. 1b, 1c and
1d, respectively, for examples of what the grids for these objects would look like.) The calculations were carried
out with the tapered damping RBC, with N = 10 (see Eq. 11 above). For the s-duct and the faceted object,
energy density calculations were made and the results at two different time steps are shown. For the circular
cylinder, we present the RCS calculation along with the theoretical value.

The display of the electromagnetic energy density for the s-duct at two different time steps is shown in
Figs. 6a and 6b. For these two pictures, the Gaussian pulse was incident from the lower left side at 30° from the
horizontal, and first struck the duct at a time index of ~12. In Fig. 6a, the energy density at an (approximate) time
index of 32 is shown. The scattering due to the impedance mismatch of the duct can be clearly seen, as well as
the energy that made it into the duct. Figure 6b shows the energy density at a time step of ~270. (The Gaussian
pulse has travelled past the duct by time step ~125.) The energy that made it inside the duct has now had time to
travel the length of the duct (ricocheting off the walls) and reflect off the back wall, and is now broadcasting from the
inlet. (There is still some energy rattling around in the duct for another 200 time steps!)

Figures 7a and 7b show the energy density maps for a Gaussian pulse scattering off a faceted PEC
object that is sitting over a ground plane. The pulse is incident from the upper left hand side of the picture (60°
from the horizontal). (Since this is a TM wave, the electric field vector is normal to the page.) Figure 7a shows a
snapshot of the energy density approximately 8 time steps after the pulse first strikes the object. The specular
reflection off the leading facet can be seen (it is propagating back toward the upper left), as well as the diffracted
energy that has scattered off the front tip and leading upper "corner". The point of contact between the pulse and
the long facet of the body is also quite visible, due to the fact that the magnetic field has a substantial component
parallel to the object's surface for this incident angle. (In the lower left corner, the point of contact between the
incident pulse and the ground plane is also evident.) The energy density after another 24 time steps is shown in
Fig. 7b. The point of contact between the primary pulse and the object has now moved to the "trailing" corner,
with the specular reflection from the main facet clearly visible. (It is comforting to note that the angle of
reflection equals the angle of incidence!) At the left side of the object, we can see that the wave that was
reflecting off the ground plane has struck the front tip and lower facet. In addition, the energy from the main
pulse that had scattered off the upper leading facet earlier can be seen in the upper left (it is spreading out as it
propagates away from the body). A series of these snapshots can be (and has been) put together to form a movie
depicting how the pulse is scattered off various parts of the body. (A final note pertaining to this object is that
when the movie was viewed, no scattering could be seen coming off the computational boundary, indicating that
this new RBC is doing its job!)

The results from a bistatic RCS calculation at 300 MHz for the infinite PEC circular cylinder are shown
in Figs. 8a and 8b. The cylinder has a radius of 1 meter. The grid used for this calculation follows the scheme of
Fig. 1c, with the cell size approximately 0.05 meters on a side. Further details of these calculations are as
follows (please also refer to Fig. 1c):

$\Delta x$ = 0.05 m;

The incident plane wave is a Gaussian-enveloped sinusoidal pulse;

The computational region has lower indices (1,1) and upper indices (101,101);

Point 1 has the indices (31,31); point 3 has the indices (71,71); and the center of the circle has
the indices (51,51).

The number of damping zones in the outer boundary is 10, and the tapered damping function is
cos{(π/3)(10+1-ibd)/10} (please refer to Fig. 5a).

We start our calculation shortly before the pulse arrives at the scatterer (our PEC cylinder). The calculation
stops when the time signal is insignificant. The time domain data for the electric field and the magnetic field in
the neighborhood (two zones) of the scatterer are retained and Fourier transformed to yield frequency domain
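data. We use the far field frequency domain representation of the scattered field for the TM case for $r \gg 1$ [19,
p. 374]:

[Equation (12), the far-field integral representation of the scattered field $E_z^s$ over the curve C, could not be recovered from the scanned source.]

where C is a closed curve enclosing the scatterer; $\hat{r}$ is the unit vector in the direction of observation; $\hat{n}'$ is
the unit outward pointing normal to the curve C; and $\vec{r}\,'$ is the point of integration. Making use of
$\partial E_z/\partial n' = \hat{n}'\cdot\nabla E_z$ and the Maxwell's equations $\partial E_z/\partial x = j\omega H_y$ and $\partial E_z/\partial y = -j\omega H_x$, (12) becomes

[Equation (13), the same integral expressed through $E_z$ and the tangential magnetic field on C, could not be recovered from the scanned source.]

where $\hat{t}'$ is the unit tangent along the curve in the counter-clockwise direction.
The 2-D bistatic RCS is defined as

$$
\sigma = 2\pi \lim_{r\to\infty} r\,\frac{|E_z^{s}|^2}{|E_z^{i}|^2}
\tag{14}
$$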

A similar expression to (13) can be derived for the TE case, with the RCS defined in terms of the z-component of the
scattered magnetic field.

These results show good agreement with the known results based on the exact series solution [18]. For
comparison purposes we also show the calculations with the stair-casing FDTD code. The improvement of our
conformal algorithm over the stair-casing FDTD for this circular cylinder is self-evident.


We have shown that an easy to implement, conformal finite difference technique, based on the integral
forms of Ampere's and Faraday's laws, can now be used to calculate the scattering from a variety of 2-D objects.
The conformal grids for these objects are produced in a relatively simple manner, adding to the potential utility
of this technique. (Even objects over a ground plane can be modeled easily.) The method includes a new RBC
that not only is easy to implement but also seems to provide excellent "absorption" of the scattered wave at the outer
computational boundary.

The data generated by the code are presentable in a variety of ways, including RCS (scattering widths),
field amplitude versus time plots and energy density maps. Results (RCS) for a circular cylinder show good
agreement with theory. Energy density displays for two objects have also been shown. Displays of this type can
provide a wealth of qualitative (and perhaps quantitative) information on the nature of the scattering process.

Fig. 1c. The grid for a circular cylinder


Figure 2b. The base of the prism to update the magnetic field
vector with the FVTD algorithm (TM case)


Figure 2a. The contour to update the electric field
with the FDTD algorithm (TM case)


Figure 3a. The contour to update the magnetic field with
the FDTD algorithm (TE case)


Figure 3b. The base of the prism to update the electric field
vector with the FVTD algorithm (TE case)


Figure 4.

Tap(ibd) = cos{(π/3)(N + 1 - ibd)/N}

If a point (i,j,k) does not belong to any zone, set Tab(i,j,k) = 1
If (i,j,k) belongs to the zone ibd, set Tab(i,j,k) = Tap(ibd), 1 ≤ ibd ≤ N


Figure 5a. Tapered damping zones at the outer boundary:
rectangular damping zones

Figure 5b. Tapered damping zones at the outer boundary:
non-rectangular damping zones

b. Time index ~45


Figure 7. Gaussian pulse incident on a faceted object
sitting above a ground plane

Radius = 1 m, Δx = 0.05 m, Δt = Δx/(2*Vc), f = 300 MHz, TM case


Figure 8a. Bistatic RCS for a PEC Circular Cylinder (TM)


Radius = 1 m, Δx = 0.05 m, Δt = Δx/(2*Vc), f = 300 MHz, TE case


Figure 8b. Bistatic RCS for a PEC Circular Cylinder (TE)


Abstract

Spectral  methods,  in  conjunction  with  domain  decomposition  techniques,  are  shown  to  be  powerful 
candidates  for  the  electromagnetic  analysis  of  electrically  long  structures  using  differential  equation-based 
formulations. Through the modeling of simple problems with known analytic solutions, spectral methods
are  shown  to  exhibit  the  exponential  accuracy  required  for  the  solution  of  structures  spanning  hundreds  of 
wavelengths  in  at  least  one  dimension.  Furthermore,  it  is  shown  how  such  accuracy  can  be  obtained  with 
operation  costs  and  storage  much  lower  than  those  required  of  standard  finite  element  approximations  to 
achieve  such  accuracy. 


1. Introduction

Domain  decomposition  techniques  have  been  proposed  recently  for  the  modeling  of  electromagnetic 
wave  interactions  with  electrically  large  structures.  One  specific  domain  decomposition  approach  splits  the 
structure into smaller regions in which numerical solutions are computed separately with "basis" excitations
on  the  partitions  between  the  regions  [1,2].  Then,  field  continuity  at  the  partitions  is  invoked  to  obtain  the 
solution  in  the  entire  structure  subject  to  a  specific  excitation.  The  key  merits  of  such  an  approach  are:  a) 
its  inherent  parallelism,  making  it  ideal  for  implementation  on  massively  parallel  machines;  and  b)  the  fact 
that  the  solutions  in  each  region  can  be  developed  using  the  most  suitable  numerical  technique  (analytical, 
differential-equation  based,  or  integral-equation  based). 

Unfortunately, it can be shown that for those regions for which finite techniques (finite element or finite
difference) are required, the numerical discretization needs to use a number of nodes per wavelength dictated
by the electrical size of the entire structure in order to keep numerical dispersion at a minimum. The source
of this difficulty is the numerical dispersion present in discrete approximations of wave phenomena. More
specifically, it was shown in [3] that the relative rms discretization error in the finite element approximation
of Helmholtz's equation is:

$$
\|e\|_0 \le C\,(hk)^{m+1}\,(kd)
\tag{1}
$$

where m is the degree of the interpolating polynomials, h is the grid size, d is the characteristic dimension
of the discretized domain, k is the wavenumber, and C is a constant dependent on geometry and boundary
conditions. The important implication of (1) is that as the electrical size of the domain increases, the
discretization error increases. This result, as well as the impact of boundary conditions, was demonstrated
also in [4,5]. To illustrate its importance, consider a 200λ × 2λ × 2λ structure. From [4], a resolution of
30 nodes per wavelength would be necessary for acceptable accuracy if quadratic finite elements are used.
With the structure decomposed into 100 regions of 2λ × 2λ × 2λ each, the number of unknowns in each region will
be 648,000!
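
One way to arrive at the quoted figure is sketched below. The factor of three unknowns per node (one per vector field component) is an assumption made here to reproduce 648,000; the text does not state how the count was obtained.

```python
def unknowns_per_region(side_in_wavelengths, nodes_per_wavelength, unknowns_per_node):
    """Unknown count for a cubic region discretized uniformly in each direction."""
    nodes_per_side = side_in_wavelengths * nodes_per_wavelength
    return nodes_per_side ** 3 * unknowns_per_node


# 2-wavelength cube, 30 nodes per wavelength, 3 field components per node (assumed).
count = unknowns_per_region(2, 30, 3)
print(count)  # 648000
```

The cubic growth with resolution is the point of the example: halving the required nodes per wavelength would cut the count per region by a factor of eight.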

In an attempt to deal with this issue we examine in this paper the advantages that the so-called spectral
methods offer in improving the accuracy of differential-equation based wave simulations. Spectral methods
have been used successfully over the past twenty years for improving the accuracy of numerical
simulations of fluid dynamical problems. An excellent presentation of the fundamentals behind spectral
methods, as well as a complete bibliography up to the end of the past decade, can be found in [6].

To present the fundamentals of spectral methods, we concentrate on simple implementations of spectral
approximations of the scalar Helmholtz equation in one and two dimensions. Boundary value problems
with known analytic solutions are solved in order to demonstrate the exponential accuracy of spectral
methods. Their implementation in conjunction with domain decomposition techniques is examined also. We
continue with a discussion of the basic procedures necessary for the computationally efficient
implementation of spectral methods. Finally, we identify specific areas of further research toward the establishment of
spectral methods as legitimate candidates for the numerical modeling of electromagnetic wave phenomena
in electrically large structures.

2.  Spectral  Approximation  of  the  Helmholtz  Equation 

Spectral methods may be thought of as the extension of the method of separation of variables to the
solution of non-separable problems. The underlying idea is to represent the solution to a problem as a
truncated series of smooth functions of the independent variables. The critical point in this representation is
that the functions used are such that the expansion converges faster than algebraically (i.e., the error caused
by truncating the expansion after N terms goes to zero faster than any finite power of 1/N as N → ∞).
Furthermore, this exponential rate of convergence is not dependent on the boundary conditions; it depends
only on the smoothness of the solution.

As shown in [6], expansions in terms of eigenfunctions of singular Sturm-Liouville problems exhibit such
exponential convergence. Eigenfunctions of singular Sturm-Liouville problems include the families of Chebyshev
polynomials, Legendre polynomials, Laguerre polynomials, etc. [7]. Expansions of the unknown solution
using these classes of orthogonal polynomials are, then, expected to converge exponentially, independently of
the type of boundary conditions imposed on the boundaries of the computational domain. This convergence
behavior should be contrasted with that of Fourier series representations of functions. It is well known that the Fourier
series representation of f(x) in the domain 0 < x < 2π converges rapidly only if the function is both smooth
and periodic.

To present the fundamentals of spectral methods, let us consider the following one-dimensional boundary
value problem:

$$
\frac{d^2U}{dx^2} + k^2(x)\,U = 0,\quad 0 < x < L;\qquad U(0) = 0,\quad U(L) = 1
\tag{2}
$$

where $k^2(x) = \omega^2\mu(x)\epsilon(x)$. In (2) it is assumed that the material properties $\epsilon(x)$ and $\mu(x)$ are smooth
functions of position so that $U(x) \in C^\infty$. However, it must be mentioned that for those cases where $\epsilon(x)$
and/or $\mu(x)$ exhibit discontinuities, spectral methods are still applicable. For example, in the spirit of domain
decomposition techniques, a media interface across which $\epsilon$ and/or $\mu$ exhibit a jump is used as a geometry
decomposition boundary. With the change of variables $\xi = 2x/L - 1$, the domain 0 < x < L is mapped onto
the domain $-1 < \xi < 1$. The unknown field $U(\xi)$ is represented as the Lagrangian interpolant through the
N + 1 points

$$
\xi_n = \cos\frac{\pi n}{N},\qquad n = 0, 1, \ldots, N
\tag{3}
$$

For this selection of points, the interpolation polynomials turn out to be the Chebyshev polynomials, $T_n(\xi) =
\cos(n\cos^{-1}\xi)$. More specifically, the approximation, $\tilde{U}$, of U takes the form

$$
\tilde{U}(\xi) = \sum_{j=0}^{N} U_j\,\phi_j(\xi)
\tag{4}
$$

where

$$
\phi_j(\xi) = \frac{2}{N c_j}\sum_{n=0}^{N}\frac{1}{c_n}\,T_n(\xi_j)\,T_n(\xi)
\tag{5}
$$

with j = 0, 1, 2, ..., N and $c_n$ defined as follows:

$$
c_n = \begin{cases} 1, & \text{if } n \neq 0, N;\\ 2, & \text{otherwise.} \end{cases}
\tag{6}
$$

Even though there are several ways of obtaining a discrete approximation to (2) in the spirit of Galerkin's
approach, we choose the collocation approach (often called pseudospectral approximation) for this example.
More specifically, we require that the equation is satisfied exactly at the interpolation points in (3).
Toward this, we need the approximation of the second derivative of U at the collocation points. Using the


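notation

$$
\frac{d^p\tilde{U}}{d\xi^p}(\xi_i) = \sum_{j=0}^{N} (D_p)_{ij}\,U_j
\tag{7}
$$

to represent the pth derivative of $\tilde{U}$ at $\xi_i$, direct differentiation of Chebyshev polynomials results in the
following expressions:

$$
(D_1)_{ij} = \frac{c_i}{c_j}\,\frac{(-1)^{i+j}}{\xi_i - \xi_j},\qquad i \neq j
\tag{8a}
$$

$$
(D_1)_{jj} = -\frac{\xi_j}{2\,(1-\xi_j^2)},\quad 1 \le j \le N-1;\qquad (D_1)_{00} = -(D_1)_{NN} = \frac{2N^2+1}{6}
\tag{8b}
$$

and

$$
D_p = (D_1)^p
\tag{9}
$$

In Section 4 we will show that this direct calculation of the matrix representation of the derivative operator is
not necessary, and a more computationally efficient procedure for calculating the derivatives at the collocation
points can be effected.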

Using (4), (5), (8) and (9) with $p = 2$ in (2), a collocation procedure leads to a linear system of equations
for the unknown coefficients $U_j$. From (8) it is clear that the resulting matrix is full. Consequently, the value
of $N$ required for good numerical accuracy becomes an issue of concern. Clearly, the issue here is to obtain
a relationship of the form of (1) for the number of expansion functions (or the number of collocation points)
per wavelength required to achieve a desirable degree of accuracy. Toward this objective, the following result
is useful:

$$\sin(M\pi\xi + \psi) = 2 \sum_{n=0}^{\infty} \frac{1}{c_n} J_n(M\pi) \sin\left(\psi + \frac{n\pi}{2}\right) T_n(\xi), \qquad (10)$$

$-1 \le \xi \le 1$, where $J_n(z)$ is the Bessel function of order $n$ (in this infinite series $c_0 = 2$ and $c_n = 1$ for
$n \ge 1$). Recalling the asymptotic form $J_n(z) \sim (1/\sqrt{2\pi n})\,(ez/(2n))^n$ as $n \to \infty$ [8], it is clear that
$J_n(M\pi) \to 0$ exponentially fast once $n > M\pi$. This result suggests that Chebyshev approximations to
time-harmonic wave solutions will start to converge rapidly when the number of polynomials retained per
wavelength is greater than $\pi$. In other words, a heuristic rule for the resolution requirements of Chebyshev
expansions is at least four collocation points per wavelength.
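The decay predicted by (10) can be checked numerically. The sketch below is plain Python under stated assumptions: $M = 4$ wavelengths across $-1 \le \xi \le 1$ and $N = 64$ are illustrative values, and the coefficients come from direct summation of the discrete transform (5) rather than from Bessel function values. It compares the largest Chebyshev coefficient of $\sin(M\pi\xi)$ below $n = M\pi$ with the largest one far beyond it.

```python
import math

def cheb_coeffs(U):
    """Discrete Chebyshev coefficients of samples U_j taken at xi_j = cos(pi*j/N)."""
    N = len(U) - 1
    c = [2.0 if n in (0, N) else 1.0 for n in range(N + 1)]
    return [
        2.0 / (N * c[n])
        * sum(U[j] / c[j] * math.cos(math.pi * j * n / N) for j in range(N + 1))
        for n in range(N + 1)
    ]

M = 4                                    # wavelengths across the interval (assumed)
N = 64                                   # polynomial degree (assumed)
xi = [math.cos(math.pi * j / N) for j in range(N + 1)]
a = cheb_coeffs([math.sin(M * math.pi * x) for x in xi])

cutoff = M * math.pi                     # coefficients should collapse for n > cutoff
largest_below = max(abs(a[n]) for n in range(N + 1) if n < cutoff)
largest_far = max(abs(a[n]) for n in range(N + 1) if n > 3 * cutoff)
print(f"cutoff n = {cutoff:.1f}")
print(f"largest |a_n| for n < cutoff   : {largest_below:.3e}")
print(f"largest |a_n| for n > 3*cutoff : {largest_far:.3e}")
```

Printing `abs(a[n])` for every `n` shows the transition more gradually; the two summary numbers are only meant to make the contrast on either side of $n = M\pi$ visible.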

In Figure 1 we plot the $L_\infty$ error, $\|\tilde U - U\|_\infty$, in the solution of problem (2) versus the number of
collocation points for a homogeneous region with $k = 2\pi$. The length of the domain (in wavelengths) is used
as a parameter. In addition to supporting the aforementioned sampling rule, the plots illustrate clearly the
exponential convergence of the spectral approximation. For example, for the case $L = 9.7\lambda$, 7 polynomials
per wavelength resulted in accuracy better than 10”^. In comparison, a linear finite element solution of this
problem with a resolution of 80 nodes per wavelength (i.e., $\approx 780$ degrees of freedom) exhibits an error of
$\approx 10^{-3}$.
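A minimal collocation solve illustrates this kind of convergence study. Since problem (2) is not restated here, the sketch assumes the homogeneous model problem $U'' + k^2 U = 0$ on $0 < x < L$ with $U(0) = 0$, $U(L) = 1$ and $k = 2\pi$, whose exact solution is $\sin(kx)/\sin(kL)$. The domain length of 3.7 wavelengths, the values of $N$ and the dense Gaussian elimination are illustrative choices, not the authors' implementation.

```python
import math

def cheb_second_derivative(N):
    """Points xi_j = cos(pi*j/N) and D2 = D1*D1, with D1 the standard collocation matrix."""
    xi = [math.cos(math.pi * j / N) for j in range(N + 1)]
    c = [2.0 if j in (0, N) else 1.0 for j in range(N + 1)]
    D1 = [[0.0] * (N + 1) for _ in range(N + 1)]
    for i in range(N + 1):
        for j in range(N + 1):
            if i != j:
                D1[i][j] = (c[i] / c[j]) * (-1) ** (i + j) / (xi[i] - xi[j])
            elif 0 < i < N:
                D1[i][i] = -xi[i] / (2.0 * (1.0 - xi[i] ** 2))
    D1[0][0] = (2.0 * N * N + 1.0) / 6.0
    D1[N][N] = -D1[0][0]
    D2 = [[sum(D1[i][k] * D1[k][j] for k in range(N + 1)) for j in range(N + 1)]
          for i in range(N + 1)]
    return xi, D2

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def helmholtz_error(N, L):
    """Max nodal error of the collocation solution; L is measured in wavelengths."""
    k = 2.0 * math.pi
    xi, D2 = cheb_second_derivative(N)
    scale = (2.0 / L) ** 2               # d^2/dx^2 = (2/L)^2 d^2/dxi^2 for xi = 2x/L - 1
    A = [[scale * D2[i][j] + (k * k if i == j else 0.0) for j in range(N + 1)]
         for i in range(N + 1)]
    b = [0.0] * (N + 1)
    # xi_0 = +1 is x = L and xi_N = -1 is x = 0: overwrite those rows with Dirichlet data.
    A[0] = [1.0 if j == 0 else 0.0 for j in range(N + 1)]
    A[N] = [1.0 if j == N else 0.0 for j in range(N + 1)]
    b[0] = 1.0
    U = solve(A, b)
    exact = [math.sin(k * L * (x + 1.0) / 2.0) / math.sin(k * L) for x in xi]
    return max(abs(u - e) for u, e in zip(U, exact))

errors = {N: helmholtz_error(N, 3.7) for N in (16, 24, 32)}
for N, err in errors.items():
    print(f"N = {N:2d}  ({N / 3.7:.1f} points per wavelength)  max error = {err:.2e}")
```

The error should fall steeply once the number of points per wavelength clears the threshold suggested by the resolution rule; the exact figures depend on the domain length and on rounding in the dense solve.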

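The extension to two dimensions is straightforward. For example, consider the scalar Helmholtz equation
for the electric field $\mathbf{E}(x, y) = \hat{z} E(x, y)$ associated with the two-dimensional TM modeling of electromagnetic
wave phenomena in a $z$-independent medium

$$\frac{\partial}{\partial x}\left(\frac{1}{\mu(x, y)} \frac{\partial E}{\partial x}\right) + \frac{\partial}{\partial y}\left(\frac{1}{\mu(x, y)} \frac{\partial E}{\partial y}\right) + 4\pi^2 \epsilon(x, y) E = 0 \qquad (11)$$

over the rectangular domain ($0 \le x \le L_1$, $0 \le y \le L_2$) with Dirichlet boundary conditions $E(x = 0, y) = 0$,
$E(x = L_1, y) = E_0(y)$, $E(x, y = 0) = 0$, $E(x, y = L_2) = 0$. With the change of variables $\xi = (2/L_1)x - 1$,
$\eta = (2/L_2)y - 1$ the domain is transformed to the square region ($-1 \le \xi \le 1$, $-1 \le \eta \le 1$) and (11) becomes:

$$\frac{\partial}{\partial \xi}\left(\frac{1}{L_1^2 \mu(\xi, \eta)} \frac{\partial E}{\partial \xi}\right) + \frac{\partial}{\partial \eta}\left(\frac{1}{L_2^2 \mu(\xi, \eta)} \frac{\partial E}{\partial \eta}\right) + \pi^2 \epsilon(\xi, \eta) E = 0 \qquad (12)$$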
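The spectral approximation of the field is written as:

$$\tilde E(\xi, \eta) = \sum_{m=0}^{M} \sum_{n=0}^{N} a_{mn} T_m(\xi) T_n(\eta)$$

where

$$a_{mn} = \frac{4}{M N c_m c_n} \sum_{i=0}^{M} \sum_{j=0}^{N} \frac{1}{c_i c_j} E_{ij} T_m(\xi_i) T_n(\eta_j)$$

(with $c_m$ and $c_i$ defined as in (6) with $N$ replaced by $M$) and $E_{ij}$ are the field values at the $(i, j)$th node of
the grid $(\xi_i, \eta_j)$; $i = 0, 1, 2, \ldots, M$, $j = 0, 1, 2, \ldots, N$, with $\xi_i = \cos(i\pi/M)$, $\eta_j = \cos(j\pi/N)$. As in the
1D case, the interpolation points are also used as collocation points for a pseudospectral approximation of
the equation.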

For a numerical example in two dimensions, we used the boundary condition

$$E(x = L_1, y) = E_0(y) = \sin(2\pi y/L_2)$$

for the boundary-value problem of (11) with $\mu = \mu_0$, $\epsilon = \epsilon_0$ and $\lambda = 1$. Thus, we simulated a standing wave
pattern for the first TEj, mode in a shorted parallel-plate waveguide. For the case $L_2 = 4\lambda$, $L_1 = 5.3\lambda$, a
$32 \times 32$ spectral approximation ($\approx 7$ polynomials per wavelength) exhibits an error of $\|\tilde E - E\|_\infty \sim$ 10“^, in
agreement with the resolution rule mentioned above.

3. Spectral Methods and Domain Decomposition

To examine the performance of spectral methods in conjunction with domain decomposition techniques,
we consider the following problem. The one-dimensional Helmholtz equation is solved in a homogeneous
domain with $k = 2\pi$ (or $\lambda = 1$) of length $L_m$, with Dirichlet boundary conditions $U(0) = 0$ and $U(L_m) = 1$.

To develop the solution, the domain is decomposed into $m$ subdomains, each of length $L = 3.7\lambda$. (For this
special geometry all subdomains are the same.) Thus, $L_m = mL$. For $m$ subdomains there exist $m - 1$
subdomain boundaries which effect the decomposition of the domain. Let $Q_i$, $i = 1, 2, \ldots, m - 1$, be the
value of $U$ at these boundaries. Let $Q_0 = U(0) = 0$ and $Q_m = U(L_m) = 1$. Within each subdomain we solve
the following two problems:


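$$\frac{d^2 U_a}{d\xi^2} + k^2 U_a = 0, \quad 0 < \xi < L; \qquad U_a(0) = 0, \quad U_a(L) = 1 \qquad (15)$$

$$\frac{d^2 U_b}{d\xi^2} + k^2 U_b = 0, \quad 0 < \xi < L; \qquad U_b(0) = 1, \quad U_b(L) = 0 \qquad (16)$$

Thus, the solution, $U^{(i)}$, in the $i$th subdomain can be written in terms of $U_a$, $U_b$ and the values of $U$ at
the subdomain interfaces $i - 1$ and $i$ as:

$$U^{(i)}(x) = Q_{i-1} U_b + Q_i U_a, \qquad (i - 1)L < x < iL \qquad (17)$$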

Enforcement of the continuity of the normal derivative across domain decomposition boundaries results
in a linear system of equations for the unknowns $Q_i$, $i = 1, 2, \ldots, m - 1$. We note here that the spectral
representation of the unknown field allows for an "exact" (within the approximation of the truncation of the
expansion) calculation of the derivative even at the boundary points. This implies that the normal derivative
continuity condition is also enforced with spectral accuracy.
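The interface system can be sketched for the homogeneous test case. To keep the example short, the subdomain solutions are taken in closed form, $U_a = \sin(k\xi)/\sin(kL)$ and $U_b = \sin(k(L - \xi))/\sin(kL)$, instead of being computed pseudospectrally as in the text; that substitution, the function names and the choice $m = 4$ are assumptions made for illustration only.

```python
import math

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def interface_values(m, L, k):
    """Interface unknowns Q_1..Q_{m-1} from continuity of dU/dx between subdomains."""
    s, c = math.sin(k * L), math.cos(k * L)
    # End-point derivatives of the closed-form U_a and U_b (assumed, see lead-in).
    dUa_0, dUa_L = k / s, k * c / s
    dUb_0, dUb_L = -k * c / s, -k / s
    Q_first, Q_last = 0.0, 1.0           # Q_0 = U(0) and Q_m = U(L_m)
    n = m - 1
    A = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for r in range(n):
        # Interface i = r + 1: Q_{i-1} U_b'(L) + Q_i U_a'(L) = Q_i U_b'(0) + Q_{i+1} U_a'(0)
        A[r][r] = dUa_L - dUb_0
        if r > 0:
            A[r][r - 1] = dUb_L
        else:
            rhs[r] -= dUb_L * Q_first
        if r < n - 1:
            A[r][r + 1] = -dUa_0
        else:
            rhs[r] += dUa_0 * Q_last
    return solve(A, rhs)

m, L, k = 4, 3.7, 2.0 * math.pi          # m chosen so that sin(k*m*L) is not zero
Q = interface_values(m, L, k)
exact = [math.sin(k * i * L) / math.sin(k * m * L) for i in range(1, m)]
max_diff = max(abs(q - e) for q, e in zip(Q, exact))
print(Q, max_diff)
```

With exact subdomain solutions the interface values agree with $\sin(kx)/\sin(kL_m)$ to rounding error, so in practice the accuracy of the assembled solution is set by how accurately the end-point derivatives of $U_a$ and $U_b$ are computed.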

In Figure 2 the error is plotted versus the number of subdomains $m$. A comparison is made
between a pseudospectral approximation (lower cluster of points) and a linear finite element approximation
(upper cluster of points). For the pseudospectral approximation, 32 grid points were used within the $3.7\lambda$
subdomain for the numerical solution for $U_a$ and $U_b$. This resulted in a ~ 10"^^ accuracy in the numerical
field and ~ 10“*” accuracy in the numerical derivative calculated at the decomposition boundaries. For
the linear finite element approximation, a resolution of 100 nodes per wavelength was used, which resulted
in a ~ 10“^ accuracy in the numerical derivatives at the decomposition boundaries. Clearly, the error of
the finite element solution becomes unacceptable for $m > 15$, assuming that an $L_\infty$ error
of less than 0.1 is desirable. (For the wave simulations studied in this paper, an $L_\infty$ error of ~ 0.1 would
result in an $L_2$ error better than 10“^.) On the other hand, the error in the pseudospectral approximation
is negligible even for the case of $m = 49$, or a domain $\approx 180\lambda$ long.

4.  Efficient  Numerical  Implementation  of  Spectral  Methods 

Spectral approximations result in full matrices, a highly undesirable property for the solution of 2D and
3D problems. For example, for a true 3D application of spectral techniques, a $32 \times 32 \times 32$ grid will result in
a full matrix of dimension $32^3 \approx 3.3 \times 10^4$. However, using the fast transform properties of Chebyshev polynomials,
and in conjunction with appropriate iterative techniques, one can solve the spectral equations with operation
costs and storage comparable to those of standard finite element approximations to the problem with the
same degrees of freedom [6].

To illustrate this point, let $K_{sp}$ be the matrix resulting from the spectral approximation of the 1D
problem in Section 2. In matrix form, the resulting linear system of equations is $K_{sp} U = f$, where $U$ is the
vector of unknowns $U_j$, and $f$ is the forcing vector. Also, let $K_f$ be the matrix resulting from a finite element
or a finite difference approximation of the same problem using the same number of degrees of freedom. $K_f$
is then used as a preconditioner for an iterative solution of the problem:

$$U^{(n+1)} = U^{(n)} - K_f^{-1}\left(K_{sp} U^{(n)} - f\right), \qquad n = 0, 1, \ldots \qquad (18)$$

Since the number of degrees of freedom in spectral approximations is small, the sparse matrix $K_f$ can be
inverted efficiently. Therefore, most of the computational labor in (18) is associated with the multiplication
$K_{sp} U^{(n)}$. However, this multiplication can be effected in $O(N \log_2 N)$ operations instead of $O(N^2)$ operations
as explained next.

Notice that we can write $K_{sp} = D_{sp} + \omega^2 P$, where $D_{sp}$ is the matrix representation of the differentiation
operations in Helmholtz's operator and $P$ is a diagonal matrix. Thus, the operation $D_{sp} U^{(n)}$ effects the
calculation of appropriate derivatives at the $N$ collocation points. Instead of direct multiplication, this
calculation can be done using Fourier transforms. This becomes immediately clear if the result
$T_n(\xi_j) = \cos(n \cos^{-1}(\cos(\pi j/N))) = \cos(\pi j n/N)$ is used in (5) to cast it in the form:

$$a_n = \frac{2}{N c_n} \sum_{j=0}^{N} \frac{1}{c_j} U_j \cos\frac{\pi j n}{N} \qquad (19)$$


From (19) it is clear that the coefficients $a_n$ can be calculated using the FFT in $O(N \log_2 N)$ operations
(assuming $N$ is a power of 2). Once the $a_n$'s are available, the derivatives of order $p$ at the collocation points
can be calculated using the following result [6]

$$\left.\frac{d^p \tilde U}{d\xi^p}\right|_{\xi = \xi_j} = \sum_{n=0}^{N} b_n^{(p)} \cos\frac{\pi j n}{N}, \qquad j = 0, 1, 2, \ldots, N \qquad (20)$$

where $b_n^{(0)} = a_n$ $(0 \le n \le N)$, and for $p > 0$ it is

$$b_N^{(p)} = 0, \qquad b_{N-1}^{(p)} = 2N\, b_N^{(p-1)} \qquad (21a)$$

$$c_n b_n^{(p)} = b_{n+2}^{(p)} + 2(n + 1)\, b_{n+1}^{(p-1)}, \qquad (0 \le n \le N - 2) \qquad (21b)$$

From (20) the calculation of the derivative can be effected in $O(N \log_2 N)$ operations using the FFT, while
from (21) the calculation of the $p$th derivative requires $pN$ calculations. Therefore, the iterative process in
(18) is an $O(N \log_2 N)$ process.
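The transform-based differentiation can be sketched as follows. This is a plain-Python sketch under stated assumptions: the cosine sums of (19) and (20) are evaluated directly, in $O(N^2)$ operations, as a stand-in for the FFT-based cosine transform the text has in mind, and the recurrence is the standard Chebyshev coefficient recurrence, started from two zero coefficients beyond the top of the expansion. The function name and the test field $\sin 2\xi$ are illustrative.

```python
import math

def cheb_derivative(U, p=1):
    """p-th derivative at xi_j = cos(pi*j/N) from nodal values U_j via coefficient space."""
    N = len(U) - 1
    c = [2.0 if n in (0, N) else 1.0 for n in range(N + 1)]

    def cosine(j, n):
        return math.cos(math.pi * j * n / N)

    # Forward transform (19): nodal values -> Chebyshev coefficients (direct sum here).
    b = [2.0 / (N * c[n]) * sum(U[j] / c[j] * cosine(j, n) for j in range(N + 1))
         for n in range(N + 1)]
    # Differentiate p times in coefficient space, O(N) work per derivative.
    for _ in range(p):
        d = [0.0] * (N + 2)              # d[N] = d[N + 1] = 0 start the recurrence
        for n in range(N - 1, -1, -1):
            d[n] = (d[n + 2] + 2.0 * (n + 1) * b[n + 1]) / (2.0 if n == 0 else 1.0)
        b = d[:N + 1]
    # Inverse transform (20): coefficients -> derivative values at the nodes.
    return [sum(b[n] * cosine(j, n) for n in range(N + 1)) for j in range(N + 1)]

N = 16
xi = [math.cos(math.pi * j / N) for j in range(N + 1)]
U = [math.sin(2.0 * x) for x in xi]      # hypothetical smooth test field
err1 = max(abs(g - 2.0 * math.cos(2.0 * x)) for g, x in zip(cheb_derivative(U, 1), xi))
err2 = max(abs(g + 4.0 * math.sin(2.0 * x)) for g, x in zip(cheb_derivative(U, 2), xi))
print(err1, err2)
```

Replacing the two direct cosine sums with a fast cosine transform changes only the cost, not the result, which is where the $O(N \log_2 N)$ operation count comes from.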

The extension of the aforementioned arguments to two and three dimensions is straightforward. However,
it is appropriate to recall here the fact that for multiple dimensions most of the computational efficiency
of transform methods comes not from the FFT but from the separability of multidimensional transforms.
For example, for the two-dimensional case with an $M \times N$ Chebyshev grid, the aforementioned matrix
multiplication requires about $NM(\log_2 N + \log_2 M)$ operations if $N$ and $M$ are powers of 2.
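As a rough illustration of this count, take (as an assumed example) $M = N = 32$, the grid size used in the two-dimensional test above, and compare with a direct multiplication by the full $(MN) \times (MN)$ matrix. Constant factors are ignored on both sides.

```latex
\begin{align*}
NM(\log_2 N + \log_2 M) &= 32 \cdot 32 \cdot (5 + 5) = 10\,240, \\
(NM)^2 &= 1024^2 = 1\,048\,576, \\
\frac{(NM)^2}{NM(\log_2 N + \log_2 M)} &= \frac{1024}{10} \approx 102 .
\end{align*}
```

Even on this modest grid the transform-based product is cheaper by about two orders of magnitude, and the gap widens as the grid grows.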

5. Conclusions

In summary, this paper has reviewed the fundamentals of spectral methods and has demonstrated the
exponential accuracy they can provide to the numerical solution of elliptic boundary value problems associated
with wave phenomena. This exponential (or spectral) accuracy is essential for the numerical modeling
of wave interactions in structures that span hundreds of wavelengths. It was shown that, taking advantage of
the availability of fast transforms for spectral eigenfunction expansions and using finite-difference or
finite-element preconditioning, highly accurate solutions can be generated with computational cost slightly higher
than that required for a finite element solution of the problem using the same number of degrees of freedom.
However, for the finite element method alone to achieve the accuracy of the spectral methods, it will require
higher-order interpolation functions, and the number of degrees of freedom in each dimension will need to
be about an order of magnitude larger than that for the spectral approximation.

Future work will concentrate on the extension of the spectral method to non-rectangular geometries
and three-dimensional problems. In addition, the implementation of truncation boundary conditions in
the spectral approximation of radiation and scattering problems will be considered.
Figure 1. $L_\infty$ error versus number of collocation points for the pseudospectral solution of the one-dimensional
Helmholtz equation with Dirichlet boundary conditions. The length of the domain (in wavelengths) is used as a parameter.
Figure 2. $L_\infty$ error versus number of subdomains for the solution of the one-dimensional Helmholtz
equation. The top cluster of points is for a linear finite element solution of the problem with 370 degrees
of freedom per subdomain. The bottom cluster of points is for a pseudospectral solution with 32 degrees of
freedom per subdomain.